Voice Configuration

Choose STT, TTS, and LLM providers for your agent's voice pipeline.

The Voice tab configures the three core services in your agent's pipeline: speech-to-text (STT), text-to-speech (TTS), and the language model (LLM).
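To make the flow between the three services concrete, here is a minimal sketch of a single conversational turn. It assumes each service can be modeled as a plain callable. The function name `run_turn` and its parameters are hypothetical and are not part of ZenTalk's API.

```python
from typing import Callable


def run_turn(
    audio_in: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """Run one caller turn through the pipeline: STT, then LLM, then TTS."""
    transcript = stt(audio_in)  # caller audio -> text
    reply = llm(transcript)     # text -> response text
    return tts(reply)           # response text -> agent audio


# Stand-in services for illustration only; real providers would go here.
fake_stt = lambda audio: audio.decode("utf-8")
fake_llm = lambda text: f"You said: {text}"
fake_tts = lambda text: text.encode("utf-8")

audio_out = run_turn(b"hello", fake_stt, fake_llm, fake_tts)
```

Whichever providers you pick in the sections below fill these three roles. The order of the stages stays the same.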

Speech-to-Text (STT)

Converts the caller's speech into text for the LLM.

| Provider | Models | Languages | Notes |
|----------|--------|-----------|-------|
| Deepgram | nova-3-general | 30+ languages | Low latency, high accuracy |
| Sarvam | saarika:v2.5, saaras:v3 | Indian languages | Optimized for Hindi, Tamil, etc. |

Text-to-Speech (TTS)

Converts the LLM's text response into speech.

| Provider | Models | Voices | Notes |
|----------|--------|--------|-------|
| Deepgram | aura-2-helena-en | Multiple English voices | Natural sounding, fast |
| Sarvam | bulbul:v3-beta | anushka (F), shubh (M) | Indian language voices |

Language Model (LLM)

The LLM is the brain of your agent: it processes the conversation and generates responses.

| Provider | Models | Notes |
|----------|--------|-------|
| OpenAI | gpt-4.1, gpt-4o | Best overall quality |
| Groq | llama-3.3-70b | Fastest inference |
| Anthropic | claude-sonnet-4.5 | Strong reasoning |
| OpenRouter | Various | Access to many models |
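The three provider choices can be thought of as one configuration object. The sketch below is illustrative only: `ServiceChoice`, `VoiceConfig`, `SUPPORTED`, and `validate` are hypothetical names, not ZenTalk's actual schema, and the lowercase provider identifiers are assumed. The provider and model values are copied from the tables above. OpenRouter is listed with "Various" models, so the sketch accepts any model name for it.

```python
from dataclasses import dataclass

# Provider/model pairs taken from the tables above. None means "any model"
# (OpenRouter is documented only as "Various"). Hypothetical structure.
SUPPORTED = {
    "stt": {
        "deepgram": {"nova-3-general"},
        "sarvam": {"saarika:v2.5", "saaras:v3"},
    },
    "tts": {
        "deepgram": {"aura-2-helena-en"},
        "sarvam": {"bulbul:v3-beta"},
    },
    "llm": {
        "openai": {"gpt-4.1", "gpt-4o"},
        "groq": {"llama-3.3-70b"},
        "anthropic": {"claude-sonnet-4.5"},
        "openrouter": None,
    },
}


@dataclass(frozen=True)
class ServiceChoice:
    provider: str
    model: str


@dataclass(frozen=True)
class VoiceConfig:
    stt: ServiceChoice
    tts: ServiceChoice
    llm: ServiceChoice


def validate(config: VoiceConfig) -> list:
    """Return a list of problems; an empty list means every choice is listed."""
    problems = []
    for service in ("stt", "tts", "llm"):
        choice = getattr(config, service)
        providers = SUPPORTED[service]
        if choice.provider not in providers:
            problems.append(f"{service}: unknown provider {choice.provider!r}")
            continue
        models = providers[choice.provider]
        if models is not None and choice.model not in models:
            problems.append(
                f"{service}: {choice.provider} does not list model {choice.model!r}"
            )
    return problems


# Example: a Hindi-focused agent using Sarvam for speech and Groq for the LLM.
config = VoiceConfig(
    stt=ServiceChoice("sarvam", "saarika:v2.5"),
    tts=ServiceChoice("sarvam", "bulbul:v3-beta"),
    llm=ServiceChoice("groq", "llama-3.3-70b"),
)
problems = validate(config)
```

Validating the three choices together catches mismatches early, such as a model name paired with the wrong provider.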

API Keys

Provider API keys can be configured at two levels:

  1. Global — Set in Settings → Provider Keys (applies to all agents)
  2. Per-agent — Override in the agent's Voice tab (takes priority)
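
The precedence between the two levels can be sketched as a simple lookup. This assumes keys are stored as provider-to-key mappings and that an empty value counts as "not set". Both are assumptions, and `resolve_api_key` is a hypothetical helper, not a ZenTalk function.

```python
from typing import Mapping, Optional


def resolve_api_key(
    provider: str,
    agent_keys: Mapping[str, str],
    global_keys: Mapping[str, str],
) -> Optional[str]:
    """Pick the key for a provider: per-agent override first, then global."""
    agent_key = agent_keys.get(provider)
    if agent_key:  # assumption: an empty string means no override is set
        return agent_key
    return global_keys.get(provider) or None


# Placeholder values for illustration only.
global_keys = {"deepgram": "global-deepgram-key", "openai": "global-openai-key"}
agent_keys = {"deepgram": "agent-deepgram-key"}

deepgram_key = resolve_api_key("deepgram", agent_keys, global_keys)  # per-agent key
openai_key = resolve_api_key("openai", agent_keys, global_keys)      # global key
```

With this ordering, an agent without an override inherits the global key, so setting a per-agent key is only needed when one agent should use a different key.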

Background Sounds

Add ambient audio during calls for a more natural experience:

| Sound | Description |
|-------|-------------|
| None | Silent background (default) |
| Office | Office ambiance |
| Cafe | Coffee shop atmosphere |
| Rain | Rain sounds |
| White Noise | Consistent background noise |
| Nature | Birds, wind, outdoor sounds |
| Keyboard | Typing sounds |

Adjust the volume slider (0–100%) to control how loud the background sound is relative to the agent's voice.
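
One way to picture the slider is as a gain applied to the background track before it is mixed with the agent's voice. The sketch below assumes a linear mapping from 0–100% to a gain of 0.0–1.0 and audio samples normalized to the range [-1.0, 1.0]. How ZenTalk actually mixes audio is not documented here, and `mix_sample` is a hypothetical function.

```python
def mix_sample(voice: float, background: float, volume_percent: float) -> float:
    """Mix one voice sample with one background sample, both in [-1.0, 1.0]."""
    if not 0 <= volume_percent <= 100:
        raise ValueError("volume_percent must be between 0 and 100")
    gain = volume_percent / 100.0       # assumed linear slider-to-gain mapping
    mixed = voice + background * gain   # voice is left at full level
    return max(-1.0, min(1.0, mixed))   # clamp to avoid clipping overflow


# At 0% the background is silent and the voice passes through unchanged.
silent = mix_sample(0.25, 0.5, 0)
# At 50% the background contributes half of its original amplitude.
half = mix_sample(0.25, 0.5, 50)
```

Under these assumptions, lower slider values keep the ambient sound well below the agent's voice, which is usually what you want for intelligibility.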