Selecting a Voice

The voice you choose defines how your agent sounds during phone calls and voice interactions. Votel integrates with three leading voice providers, each with distinct strengths. You can preview, filter, and fine-tune any voice before going live.

Voice Providers

Cartesia

The fastest voice provider available. Cartesia delivers low-latency, natural-sounding speech that keeps conversations feeling smooth and responsive. Recommended as the default choice for most agents.

OpenAI

Consistent and reliable voice output. OpenAI voices offer a stable, professional sound with minimal variation between calls. A solid choice when predictability matters most.

ElevenLabs

The most expressive provider with the widest variety of voices. ElevenLabs excels at conveying emotion and personality. Best for agents where a distinctive, engaging voice is a priority.
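The provider guidance above comes down to a lookup from your top priority to a provider. The sketch below is illustrative only: the priority labels and the `recommend_provider` helper are hypothetical names, not part of any Votel API.

```python
from typing import Optional

# Hypothetical priority labels mapped to the providers described above.
PROVIDER_BY_PRIORITY = {
    "latency": "Cartesia",           # fastest, low-latency speech
    "consistency": "OpenAI",         # stable output with minimal variation
    "expressiveness": "ElevenLabs",  # most expressive, widest voice variety
}

# Cartesia is the recommended default for most agents.
DEFAULT_PROVIDER = "Cartesia"


def recommend_provider(priority: Optional[str] = None) -> str:
    """Return the provider suited to a priority, or the default if none matches."""
    if priority is None:
        return DEFAULT_PROVIDER
    return PROVIDER_BY_PRIORITY.get(priority, DEFAULT_PROVIDER)


print(recommend_provider("expressiveness"))  # ElevenLabs
print(recommend_provider())                  # Cartesia
```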

Filtering Voices

Use the built-in filters to narrow down your options:

  • Accent -- select from various regional accents
  • Age -- choose younger or more mature-sounding voices
  • Gender -- filter by male, female, or neutral voices
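Conceptually, the three filters combine with AND logic: a voice is shown only if it matches every filter you have set. The sketch below models that behavior. The `Voice` class, the `filter_voices` helper, and the catalog entries are invented placeholders for illustration, not real voices or a Votel API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Voice:
    name: str
    provider: str
    accent: str
    age: str     # placeholder labels such as "young" or "mature"
    gender: str  # "male", "female", or "neutral"


def filter_voices(
    voices: List[Voice],
    accent: Optional[str] = None,
    age: Optional[str] = None,
    gender: Optional[str] = None,
) -> List[Voice]:
    """Keep voices that match every filter that is set; None means 'any'."""
    wanted = {"accent": accent, "age": age, "gender": gender}
    return [
        v
        for v in voices
        if all(value is None or getattr(v, field) == value
               for field, value in wanted.items())
    ]


# Invented sample entries, used only to exercise the filter.
catalog = [
    Voice("Sample A", "Cartesia", "american", "young", "female"),
    Voice("Sample B", "OpenAI", "british", "mature", "male"),
    Voice("Sample C", "ElevenLabs", "american", "mature", "neutral"),
]

matches = filter_voices(catalog, accent="american", age="mature")
print([v.name for v in matches])  # ['Sample C']
```

Leaving a filter unset keeps it from narrowing the list, so calling `filter_voices(catalog)` with no filters returns the whole catalog.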

Previewing Voices

Click the play button next to any voice to hear a sample. Listen to several options before making your selection. The preview gives you an accurate sense of how the voice will sound during live conversations.

Voice Settings

After selecting a voice, you can adjust the following settings to fine-tune its behavior:

| Setting | What It Controls |
| --- | --- |
| Speed | How fast the agent speaks. Increase for a brisker pace, decrease for a slower, more deliberate delivery. |
| Stability | Consistency of tone and pitch across sentences. Higher stability means more uniform delivery; lower stability allows more natural variation. |
| Clarity + Similarity | How natural and true-to-sample the voice sounds. Higher values produce cleaner, more faithful output. |
| Style Exaggeration | How expressive the voice is. Increase for a more animated, emotional delivery; decrease for a neutral, even tone. |

All settings ship with optimized defaults that work well for most agents. Adjust them only when you want a specific effect.
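One way to think about this is to record only the settings you deliberately change and leave everything else at its default. The sketch below illustrates that pattern. The setting keys, the `build_overrides` helper, and the example value are assumptions for illustration; this page does not specify identifiers, value ranges, or default values.

```python
# Hypothetical keys mirroring the four settings in the table above.
SETTING_NAMES = ("speed", "stability", "clarity_similarity", "style_exaggeration")


def build_overrides(**changes):
    """Return only the explicitly changed settings; reject unknown names.

    A value of None means "keep the default", so it is dropped.
    """
    unknown = sorted(set(changes) - set(SETTING_NAMES))
    if unknown:
        raise ValueError(f"Unknown voice setting(s): {', '.join(unknown)}")
    return {name: value for name, value in changes.items() if value is not None}


# Example: speak slightly faster, leave every other setting at its default.
# The value 1.1 is illustrative; the real scale is not documented here.
overrides = build_overrides(speed=1.1, style_exaggeration=None)
print(overrides)  # {'speed': 1.1}
```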


Next Steps

  • TTS Voices -- detailed reference for all available text-to-speech voices and provider capabilities