🎵Complete Audio Studio

A professional audio production suite built into your IPBX: 71 local TTS voices via Coqui across 17 languages (zero cost), Google Cloud TTS with Standard, WaveNet, Neural2 and Studio tiers, speech-to-text via Chirp 3, and a visual audio mixer with waveform editing, fade controls and a timeline, plus a library shared across the entire company. Local TTS is included with no file or generation limits; Google Cloud voices require API credentials.

⚙️How it works

  1. Record directly from your browser microphone with real-time waveform visualization and playback controls
  2. Generate speech with Coqui TTS: 71 voices across 17 languages (French, English, Spanish, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Hungarian, Korean, Japanese, Hindi), completely free and local
  3. Generate premium speech with Google Cloud TTS: choose from the Standard, WaveNet, Neural2 and Studio voice tiers for broadcast-quality output
  4. Upload audio files in any common format (MP3, WAV, GSM, OGG, M4A), up to 50 MB per file
  5. Use the visual audio mixer to combine voice and music tracks: see waveforms, adjust independent volumes, set fade in/out curves and position tracks precisely on the timeline
  6. Transcribe existing audio recordings to text with the Google Cloud Chirp 3 speech-to-text model
  7. Preview any audio file or mix in real time before saving, so you hear exactly what your callers will hear
  8. Store all audio files in a shared company library accessible to all authorized users
  9. Rely on automatic format conversion to make every file compatible with Asterisk, regardless of the source format
  10. Organize your library with categories, tags and search to quickly find the right file for any dialplan module

💡Use cases

  • Create IVR welcome messages in multiple languages using Coqui TTS — generate all 17 language versions in minutes at zero cost
  • Produce broadcast-quality announcements with Google Cloud Neural2 or Studio voices for premium brand image
  • Mix professional music on hold: combine background music with periodic voice announcements using the waveform mixer
  • Record quick personalized greetings directly from your browser without any audio editing software
  • Build a multi-language audio library for an international company with consistent voice across all languages
  • Create queue announcements with position and wait time messages in multiple languages
  • Transcribe existing audio recordings to text using Chirp 3 for documentation or accessibility
  • Produce seasonal or promotional messages (holiday greetings, special offers) and schedule them via dialplan scheduling

Benefits

  • 71 local TTS voices via Coqui — completely free, unlimited, no external API costs
  • 17 languages supported: French, English, Spanish, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Hungarian, Korean, Japanese, Hindi
  • Google Cloud TTS with 4 quality tiers: Standard, WaveNet, Neural2 and Studio for premium voice output
  • Speech-to-text via Google Cloud Chirp 3 model for audio transcription
  • Visual audio mixer with waveform display, independent volume controls, fade curves and timeline positioning
  • Shared company-wide audio library with categories, tags and search
  • Browser-based microphone recording with real-time waveform visualization
  • Automatic format conversion to Asterisk-compatible formats
  • Real-time preview of all audio files and mixes before deployment
  • No file or generation limits on local TTS — create as many audio files as you need

⚙️Configuration

All audio features are accessible from the administration interface. Local TTS (Coqui) works out of the box. Google Cloud TTS requires API credentials for premium voices.

File upload

Upload your audio files from your computer. Supported formats: MP3, WAV, GSM, OGG, M4A (max 50 MB). Files are automatically converted and stored in the company library.
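The upload constraints above can be illustrated with a small pre-check. This is only a sketch, not the product's actual validation: the function name `check_upload` is hypothetical, and reading "50 MB" as 50 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

# Formats and size limit as listed above. Treating "50 MB" as
# 50 * 1024 * 1024 bytes is an assumption.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".gsm", ".ogg", ".m4a"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

Checking the extension case-insensitively means `welcome.MP3` and `welcome.mp3` are treated the same way. A real service would likely also inspect the file contents rather than trust the name alone.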

Microphone recording

Record directly from your browser with your microphone. A real-time waveform is displayed during recording, and the file is automatically converted to WAV format.
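For readers curious what a conversion to an Asterisk-friendly WAV might look like, here is a minimal sketch that builds an ffmpeg command. The platform's real conversion pipeline is not described here; the use of ffmpeg, the helper name `asterisk_wav_command`, and the target of 8 kHz mono 16-bit PCM (a format Asterisk commonly plays natively) are all assumptions.

```python
def asterisk_wav_command(source: str, target: str) -> list[str]:
    """Build an ffmpeg command for 8 kHz, mono, 16-bit PCM WAV output."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the target if it already exists
        "-i", source,         # input file in any format ffmpeg can decode
        "-ar", "8000",        # output sample rate in Hz
        "-ac", "1",           # one channel (mono)
        "-c:a", "pcm_s16le",  # signed 16-bit little-endian PCM
        target,
    ]
```

The returned list can be passed to `subprocess.run(command, check=True)` on a machine where ffmpeg is installed.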

TTS generation

Choose your engine: Coqui TTS (71 local voices, 17 languages, free) or Google Cloud TTS (Standard, WaveNet, Neural2 and Studio tiers). Select a language and a voice, then enter your text (max 5000 characters).
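Scripts longer than the 5000-character limit have to be generated in several passes. The sketch below shows one possible way to split text at sentence boundaries beforehand; the helper `split_for_tts` is hypothetical and not part of the product.

```python
import re

MAX_CHARS = 5000  # per-generation limit stated above


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be generated as its own audio file and the results placed one after another on the mixer timeline.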

Audio mixer

Select a voice track and a music track from your library. Adjust their volumes independently, set fade in/out curves, and position the voice precisely on the music timeline using the visual waveform editor.
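Conceptually, the mixer performs three operations: per-track gain, fades, and placing the voice at an offset on the music. The sketch below illustrates them on plain lists of samples in the range -1.0 to 1.0. It is not the product's implementation; the function names, the linear fade shape and the default gains are assumptions chosen for illustration.

```python
def linear_fade(samples: list[float], fade_len: int) -> list[float]:
    """Apply a linear fade-in and fade-out of `fade_len` samples each."""
    n = len(samples)
    fade_len = min(fade_len, n // 2)  # fades never overlap in the middle
    out = list(samples)
    for i in range(fade_len):
        gain = i / fade_len
        out[i] *= gain           # fade in from the start
        out[n - 1 - i] *= gain   # fade out towards the end
    return out


def mix(music: list[float], voice: list[float], voice_offset: int,
        music_gain: float = 0.3, voice_gain: float = 1.0) -> list[float]:
    """Overlay `voice` on `music`, starting at sample index `voice_offset`."""
    length = max(len(music), voice_offset + len(voice))
    out = [0.0] * length
    for i, sample in enumerate(music):
        out[i] += sample * music_gain
    for i, sample in enumerate(voice):
        out[voice_offset + i] += sample * voice_gain
    # Clamp the sum to [-1.0, 1.0] so loud passages do not overflow.
    return [max(-1.0, min(1.0, sample)) for sample in out]
```

Offsets and fade lengths are expressed in samples: assuming an 8000 Hz sample rate, a 3-second fade is `fade_len=24000` and a voice starting 30 seconds in uses `voice_offset=240000`.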

📚Examples

Multilingual IVR: generate welcome messages in French, English, Spanish and German using Coqui TTS (free) — 4 languages in under 5 minutes with no API cost

Premium brand message: use Google Cloud Studio voice to create a broadcast-quality welcome message, then mix it with royalty-free background music in the audio mixer

Music on hold production: upload background music, generate a periodic voice announcement with Neural2 TTS, combine them in the mixer with 3-second fade in/out and 30-second voice positioning

Quick holiday greeting: record a personal message via browser microphone, add festive background music in the mixer, and deploy to your IVR through the dialplan scheduling module

International audio library: create a complete set of announcements (welcome, IVR options, queue messages, voicemail greetings) in all 17 supported languages using Coqui TTS, organized by category in the shared library
