The Speak Text API uses Vobiz’s text-to-speech (TTS) engine to convert written text into spoken audio during ongoing calls, so you can deliver dynamic messages, notifications, or instructions without pre-recording audio files.
Supported: 29 languages with WOMAN and MAN voices (availability varies by language). The default is English (US) with the WOMAN voice.

Key features

29 languages

English, Spanish, French, German, Chinese, Japanese, and other major languages.

Multiple voices

Choose between WOMAN and MAN voices for most languages.

Dynamic content

Generate speech on the fly without pre-recording, which suits personalized messages.

Call leg selection

Choose which participants hear the speech: caller, callee, or both.

Loop support

Repeat text indefinitely for continuous notifications or prompts.

Audio mixing

Mix speech with call audio or mute participants during playback.

Available operations

Speak text

POST - Convert text to speech and play it during an active call with language and voice options.

Stop speaking

DELETE - Stop text-to-speech playback that is currently in progress on a call.
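
As a rough illustration of how the two operations pair up, the Python sketch below builds a POST request to start speech and a DELETE request to stop it, without sending anything. The base URL, the path layout, the JSON field names (`text`, `language`, `voice`, `legs`, `loop`, `mix`), and the `en-US` language code are all assumptions made for this example; this overview does not specify them, so check the Speak text and Stop speaking reference pages for the real values.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Placeholder host and path layout: assumptions for illustration only,
# not the documented Vobiz endpoint.
BASE_URL = "https://api.example.invalid/v1"

VOICES = ("WOMAN", "MAN")             # voice names mentioned in this overview
LEGS = ("caller", "callee", "both")   # participants who can hear the speech


@dataclass
class PreparedRequest:
    """A request description that any HTTP client could send."""
    method: str
    url: str
    body: Optional[str] = None


def speak_request(call_id, text, language="en-US", voice="WOMAN",
                  legs="both", loop=False, mix=True):
    """Describe a POST that speaks `text` on an active call.

    All JSON field names here are hypothetical.
    """
    if voice not in VOICES:
        raise ValueError(f"voice must be one of {VOICES}")
    if legs not in LEGS:
        raise ValueError(f"legs must be one of {LEGS}")
    payload = {
        "text": text,
        "language": language,
        "voice": voice,
        "legs": legs,
        "loop": loop,   # repeat the text until stopped
        "mix": mix,     # mix speech with call audio instead of muting
    }
    url = f"{BASE_URL}/calls/{call_id}/speak"
    return PreparedRequest("POST", url, json.dumps(payload))


def stop_request(call_id):
    """Describe a DELETE that stops speech in progress on the same call."""
    return PreparedRequest("DELETE", f"{BASE_URL}/calls/{call_id}/speak")
```

Keeping request construction separate from sending makes the payload easy to inspect and validate before it reaches a live call. A looping prompt started with `speak_request(call_id, text, loop=True)` would typically be ended later with `stop_request(call_id)`.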

Common use cases

  • Dynamic IVR menus - Generate menu options and prompts based on customer data or time of day.
  • Personalized greetings - Welcome callers by name or deliver customized messages based on caller ID.
  • Queue updates - Announce queue position, wait times, or estimated callback times dynamically.
  • Multi-language support - Deliver messages in the caller’s preferred language for global support.
  • Real-time notifications - Deliver account balances, order statuses, or appointment confirmations on demand.

Best practices

Keep text concise - Short, clear messages are easier to understand. Break long content into multiple segments if needed.
Use proper punctuation - Punctuation affects speech pacing and tone. Use periods, commas, and question marks for natural-sounding speech.
Avoid special characters - Emojis and special symbols may not be pronounced correctly. Use plain text for best results.
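
The practices above can be applied before text is submitted. The following is a minimal Python sketch, not part of the Vobiz API: `prepare_segments` is a hypothetical helper, and the 200-character default is an arbitrary value chosen for illustration rather than a documented limit. It drops emojis and symbols, keeps basic punctuation, and groups whole sentences into short segments.

```python
import re


def prepare_segments(text, max_chars=200):
    """Clean text for TTS and split it into short, sentence-aligned segments.

    max_chars is an illustrative default, not a documented limit.
    """
    # Keep letters and digits (in any language), whitespace, and basic
    # punctuation; drop emojis and other symbols.
    cleaned = re.sub(r"[^\w\s.,?!'-]", "", text)
    # Collapse runs of whitespace left behind by removed characters.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()

    # Split after sentence-ending punctuation so pacing cues stay intact.
    sentences = re.split(r"(?<=[.?!])\s+", cleaned)

    segments, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        if current and len(current) + 1 + len(sentence) > max_chars:
            # Adding this sentence would exceed the limit: start a new segment.
            segments.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        segments.append(current)
    # Note: a single sentence longer than max_chars is kept whole.
    return segments
```

Each returned segment could then be sent as its own speak request. Splitting on sentence boundaries rather than at a fixed character count avoids cutting a message mid-phrase.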