Foundry's Speech services provide transcription, speech synthesis, and speaker recognition for voice-powered AI applications.

| Service | Function | Key Features |
|---|---|---|
| Speech-to-Text | Transcribe audio to text | Real-time & batch, 100+ languages, custom models |
| Text-to-Speech | Convert text to natural speech | 400+ neural voices, custom voice cloning |
| Voice Live | Real-time speech-to-speech | Fully managed runtime, noise suppression, barge-in (New in 2026) |
| Speaker Recognition | Identify speakers by voice | Verification and identification modes |
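
As a small illustration of the Text-to-Speech row, the sketch below builds a minimal SSML document that selects a neural voice. It assumes the service accepts standard W3C SSML input. The helper name `build_ssml` is hypothetical, and the default voice `en-US-JennyNeural` is only an example value, so check the voice list available in your own deployment.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, voice: str = "en-US-JennyNeural", lang: str = "en-US") -> str:
    """Wrap plain text in a minimal SSML document (hypothetical helper).

    The voice and language defaults are assumptions for illustration only.
    """
    # Escape &, < and > so arbitrary input text cannot break the XML.
    body = escape(text)
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    # Print the SSML that would be sent to a synthesis request.
    print(build_ssml("Welcome to the voice assistant."))
```

The function only produces the request payload, so it runs without credentials or network access. Special characters are escaped: `build_ssml("Tom & Jerry")` yields a voice element containing `Tom &amp; Jerry`.
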

Combine Speech services with the Agent Service to build voice-controlled AI assistants. The 2026 Voice Live integration simplifies this with its fully managed runtime: