Complete Audio Studio
A professional audio production suite built into your IPBX. It includes 71 local TTS voices via Coqui across 17 languages (zero cost), Google Cloud TTS with Standard, WaveNet, Neural2 and Studio tiers, speech-to-text via Chirp 3, and a visual audio mixer with waveform editing, fade controls and a timeline, plus a shared library for the entire company. Local TTS is included and unlimited; Google Cloud voices require API credentials.
⚙️ How it works
Record directly from your browser microphone with real-time waveform visualization and playback controls
Generate speech with Coqui TTS: 71 voices across 17 languages (French, English, Spanish, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Hungarian, Korean, Japanese, Hindi) — completely free and local
Generate premium speech with Google Cloud TTS: choose from the Standard, WaveNet, Neural2 and Studio voice tiers for broadcast-quality output
Upload audio files in any common format (MP3, WAV, GSM, OGG, M4A) up to 50 MB per file
Use the visual audio mixer to combine voice and music tracks: see waveforms, adjust independent volumes, set fade in/out curves and position tracks precisely on the timeline
Transcribe existing audio recordings to text with the Google Cloud Chirp 3 speech-to-text model
Preview any audio file or mix in real-time before saving — hear exactly what your callers will hear
All audio files are stored in a shared company library accessible to all authorized users
Automatic format conversion ensures every file is compatible with Asterisk regardless of the source format
Organize your library with categories, tags and search to quickly find the right file for any dialplan module
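The automatic format conversion mentioned above can be illustrated with a small sketch. The page does not say which converter or target format the product uses, so the example rests on two assumptions: that the `ffmpeg` command-line tool does the work, and that the target is 8 kHz, mono, 16-bit PCM WAV, a format Asterisk plays natively. The function name `asterisk_wav_command` and the `converted` directory are hypothetical.

```python
from pathlib import Path


def asterisk_wav_command(source: str, dest_dir: str = "converted") -> list[str]:
    """Build an ffmpeg argument list that converts `source` to 8 kHz mono WAV.

    Assumption: 8 kHz / mono / 16-bit PCM is used as the Asterisk-compatible
    target; the product's real conversion settings are not documented here.
    """
    src = Path(source)
    dest = Path(dest_dir) / (src.stem + ".wav")
    return [
        "ffmpeg",
        "-y",                 # overwrite the destination if it already exists
        "-i", str(src),       # input file in any format ffmpeg can decode
        "-ar", "8000",        # resample to 8 kHz
        "-ac", "1",           # downmix to a single (mono) channel
        "-c:a", "pcm_s16le",  # encode as 16-bit signed little-endian PCM
        str(dest),
    ]
```

With `ffmpeg` installed, the list can be passed to `subprocess.run(asterisk_wav_command("welcome.mp3"), check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.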
💡 Use cases
- Create IVR welcome messages in multiple languages using Coqui TTS: generate all 17 language versions in minutes at zero cost
- Produce broadcast-quality announcements with Google Cloud Neural2 or Studio voices for a premium brand image
- Mix professional music on hold: combine background music with periodic voice announcements using the waveform mixer
- Record quick personalized greetings directly from your browser without any audio editing software
- Build a multi-language audio library for an international company with a consistent voice across all languages
- Create queue announcements with position and wait time messages in multiple languages
- Transcribe existing audio recordings to text using Chirp 3 for documentation or accessibility
- Produce seasonal or promotional messages (holiday greetings, special offers) and schedule them via dialplan scheduling
✅ Benefits
- ✓ 71 local TTS voices via Coqui: completely free, unlimited, no external API costs
- ✓ 17 languages supported: French, English, Spanish, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Hungarian, Korean, Japanese, Hindi
- ✓ Google Cloud TTS with 4 quality tiers: Standard, WaveNet, Neural2 and Studio for premium voice output
- ✓ Speech-to-text via Google Cloud Chirp 3 model for audio transcription
- ✓ Visual audio mixer with waveform display, independent volume controls, fade curves and timeline positioning
- ✓ Shared company-wide audio library with categories, tags and search
- ✓ Browser-based microphone recording with real-time waveform visualization
- ✓ Automatic format conversion to Asterisk-compatible formats
- ✓ Real-time preview of all audio files and mixes before deployment
- ✓ No file or generation limits on local TTS: create as many audio files as you need
⚙️ Configuration
All audio features are accessible from the administration interface. Local TTS (Coqui) works out of the box. Google Cloud TTS requires API credentials for premium voices.
File upload
Upload your audio files from your computer. Supported formats: MP3, WAV, GSM, OGG, M4A (max 50 MB). Files are automatically converted and stored in the company library.
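The upload rules above (five accepted formats, 50 MB maximum) can be expressed as a simple pre-upload check. This is a minimal sketch, not the product's validation code: the function name `check_upload` is hypothetical, the check looks only at the file extension rather than the file contents, and "50 MB" is assumed to mean 50 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Formats listed in the upload settings.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".gsm", ".ogg", ".m4a"}
# Assumption: "50 MB" is interpreted as 50 MiB.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (accepted, reason) for a candidate upload."""
    ext = Path(filename).suffix.lower()  # case-insensitive extension match
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"unsupported format: {ext or 'no extension'}"
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "file exceeds the 50 MB limit"
    return True, "ok"
```

For example, `check_upload("hold-music.MP3", 4_200_000)` is accepted, while a `.txt` file or a 60 MB WAV is rejected with a reason that can be shown to the user.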
Microphone recording
Record directly from your browser with your microphone. Real-time waveform visualization during recording. The file is automatically converted to WAV format.
TTS generation
Choose your engine: Coqui TTS (71 local voices, 17 languages, free) or Google Cloud TTS (Standard, WaveNet, Neural2 or Studio tier). Select the language and voice, then enter your text (max 5000 characters).
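Scripts longer than the 5000-character limit have to be generated in several passes. The page does not describe how the product handles longer text, so the following is only one possible way to prepare it beforehand: a hypothetical helper, `split_for_tts`, that cuts a script into chunks at sentence boundaries and falls back to a hard cut for any single sentence that exceeds the limit.

```python
import re

MAX_TTS_CHARS = 5000  # limit stated in the TTS generation settings


def split_for_tts(text: str, limit: int = MAX_TTS_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Split after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-cut a single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate      # sentence still fits in the current chunk
        else:
            chunks.append(current)   # close the chunk and start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be generated as a separate audio file and the results joined in the mixer or played back to back in the dialplan.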
Audio mixer
Select a voice track and a music track from your library. Adjust the volume of each track independently, set fade in/out curves, and position the voice precisely on the music timeline using the visual waveform editor.
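To make the mixer controls concrete, here is a minimal sketch of what volume, fade and timeline position mean at the sample level. It is not the product's mixing engine: it assumes audio is held as lists of floats in the range -1.0 to 1.0, uses linear fades (the product's fade curves may differ), and the function names `apply_fades` and `mix` are hypothetical.

```python
def apply_fades(samples: list[float], fade_in: int, fade_out: int) -> list[float]:
    """Apply linear fade-in and fade-out ramps; lengths are given in samples."""
    out = list(samples)
    n = len(out)
    for i in range(min(fade_in, n)):
        out[i] *= i / fade_in            # ramp up from silence at the start
    for i in range(min(fade_out, n)):
        out[n - 1 - i] *= i / fade_out   # ramp down to silence at the end
    return out


def mix(music: list[float], voice: list[float], voice_offset: int,
        music_gain: float = 0.3, voice_gain: float = 1.0) -> list[float]:
    """Overlay `voice` on `music`, starting `voice_offset` samples into the music."""
    mixed = [s * music_gain for s in music]
    for i, s in enumerate(voice):
        pos = voice_offset + i
        if pos < 0:
            continue   # ignore voice samples placed before the start
        if pos >= len(mixed):
            break      # voice runs past the end of the music: truncate
        mixed[pos] += s * voice_gain
    # Clip to the valid range so loud overlaps do not overflow.
    return [max(-1.0, min(1.0, s)) for s in mixed]
```

Durations convert to sample counts by multiplying by the sample rate: at 8000 samples per second, a 3-second fade is 24000 samples, and placing the voice 30 seconds into the music corresponds to `voice_offset=240000`.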
📚 Examples
Multilingual IVR: generate welcome messages in French, English, Spanish and German using Coqui TTS (free) — 4 languages in under 5 minutes with no API cost
Premium brand message: use Google Cloud Studio voice to create a broadcast-quality welcome message, then mix it with royalty-free background music in the audio mixer
Music on hold production: upload background music, generate a periodic voice announcement with Neural2 TTS, then combine them in the mixer with 3-second fades in and out and the voice positioned 30 seconds into the music
Quick holiday greeting: record a personal message via browser microphone, add festive background music in the mixer, and deploy to your IVR through the dialplan scheduling module
International audio library: create a complete set of announcements (welcome, IVR options, queue messages, voicemail greetings) in all 17 supported languages using Coqui TTS, organized by category in the shared library