Twilio’s <Response> becomes Vobiz’s <Response>, <Say> becomes <Speak>, and the builder twilio.twiml.voice_response.VoiceResponse() becomes vobizxml.ResponseElement(). This page maps each voice verb and its attributes, then shows before/after IVRs you can copy.
The builder swap
Twilio’s helper libraries expose a VoiceResponse builder; Vobiz ships the equivalent ResponseElement in the bundled vobizxml module. Nesting, chaining, and serialization all mirror what you already do.
Verb-by-verb mapping
| TwiML verb | Builder (Twilio) | VobizXML verb | Builder (Vobiz) | Notes |
|---|---|---|---|---|
| <Say> | response.say(text, voice=, language=, loop=) | <Speak> | add_speak(text, voice=, language=, loop=) | Voice names man/woman map to Vobiz MAN/WOMAN; language and loop carry over 1:1. |
| <Play> | response.play(url, loop=) | <Play> | add_play(url, loop=) | Plays MP3/WAV from a URL; loop="0" repeats until the call moves on. |
| <Gather> | response.gather(input=, timeout=, num_digits=, finish_on_key=, speech_timeout=, action=) | <Gather> | add_gather(input_type=, execution_timeout=, num_digits=, finish_on_key=, speech_end_timeout=, action=) | input→inputType, timeout→executionTimeout, speechTimeout→speechEndTimeout; numDigits/finishOnKey keep their names. Vobiz adds digitEndTimeout for inter-digit timing. |
| <Dial> | response.dial(caller_id=, timeout=, time_limit=, hangup_on_star=, action=) | <Dial> | add_dial(caller_id=, timeout=, time_limit=, hangup_on_star=, action=) | callerId, timeout, timeLimit, hangupOnStar, action all carry over by name. |
| <Number> (in <Dial>) | dial.number('+1…') | <Number> | add_number('+1…') | PSTN destination nested in Dial. |
| <Client> / <Sip> (in <Dial>) | dial.client('agent') / dial.sip('sip:…') | <User> | add_user('sip:…') | Reach a SIP endpoint or softphone; supports sendDigits and sipHeaders. |
| <Record> | response.record(timeout=, finish_on_key=, max_length=, play_beep=, action=) | <Record> | add_record(timeout=, finish_on_key=, max_length=, play_beep=, file_format=, action=) | timeout is the silence limit on both; Vobiz adds fileFormat (mp3/wav) and startOnDialAnswer. |
| <Conference> (in <Dial>) | dial.conference('Room 1234') | <Conference> | add_conference('Room 1234') | Named room by text content. On Vobiz, Conference is a top-level verb: return it directly from Response. |
| <Pause> | response.pause(length=) | <Wait> | add_wait(length=) | Silent pause; Vobiz Wait adds optional silence/minSilence for beep detection. |
| <Redirect> | response.redirect(url, method=) | <Redirect> | add_redirect(url, method=) | Hand control to another answer URL that returns fresh XML. |
| <Hangup> | response.hangup() | <Hangup> | add_hangup() | Ends the call. |
| <Stream> | response.start().stream(url=) / response.connect().stream(url=) | <Stream> | add_stream(url, audio_track=, bidirectional=) | Fork call audio to a wss:// server; track→audioTrack, and bidirectional="true" gives two-way media. |
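The renames in the table are mechanical enough to script for a first pass over existing TwiML. The sketch below uses only Python's standard library (not the vobizxml builder) and applies the verb renames, the Gather and Stream attribute renames, and the man/woman voice mapping from the table. The function name `translate` is hypothetical, and the sketch deliberately does nothing else: it does not move `<Conference>` out of `<Dial>`, and it does not adjust element content, so review the output by hand.

```python
import xml.etree.ElementTree as ET

# Verb renames from the mapping table; verbs not listed keep their name.
VERB_RENAMES = {"Say": "Speak", "Pause": "Wait", "Client": "User", "Sip": "User"}

# Attribute renames per TwiML verb, also from the mapping table.
ATTR_RENAMES = {
    "Gather": {
        "input": "inputType",
        "timeout": "executionTimeout",
        "speechTimeout": "speechEndTimeout",
    },
    "Stream": {"track": "audioTrack"},
}


def translate(twiml: str) -> str:
    """Rename TwiML verbs and attributes to their Vobiz names (names only)."""
    root = ET.fromstring(twiml)
    for element in root.iter():
        # Look up renames by the original TwiML tag before the tag changes.
        renames = ATTR_RENAMES.get(element.tag, {})
        renamed = {renames.get(key, key): value for key, value in element.attrib.items()}
        # The table maps Twilio voices man/woman to Vobiz MAN/WOMAN.
        if element.tag == "Say" and renamed.get("voice") in ("man", "woman"):
            renamed["voice"] = renamed["voice"].upper()
        element.attrib.clear()
        element.attrib.update(renamed)
        element.tag = VERB_RENAMES.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")
```

Attribute renames are keyed by verb on purpose: `timeout` becomes `executionTimeout` only on Gather, while Dial and Record keep `timeout` as the table states.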
Before / after: a DTMF IVR menu
A one-key menu. input and timeout become inputType and executionTimeout, numDigits keeps its name, and the nested say becomes add_speak.
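As a rough illustration of the Vobiz side of this menu, the sketch below renders the XML with Python's standard library so it runs without any SDK installed. With the vobizxml builder you would reach the same shape through add_gather and add_speak as listed in the mapping table. The function name `build_menu`, the prompt text, the timeout value, and the action URL are placeholders, not values from the Vobiz docs.

```python
import xml.etree.ElementTree as ET


def build_menu(action_url: str) -> str:
    """Render a one-key DTMF menu using the Vobiz attribute names."""
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {
            "inputType": "dtmf",        # Twilio: input
            "executionTimeout": "5",    # Twilio: timeout (placeholder value)
            "numDigits": "1",           # same name on both platforms
            "action": action_url,       # where the pressed digit is posted
        },
    )
    # The prompt is nested inside Gather, as the nested <Say> was in TwiML.
    prompt = ET.SubElement(gather, "Speak")
    prompt.text = "Press 1 for sales, or 2 for support."
    return ET.tostring(response, encoding="unicode")


# Hypothetical action URL, for illustration only.
print(build_menu("https://example.com/ivr/choice"))
```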
Before / after: forward a call with caller ID
Twilio’s <Dial> + <Number> maps straight to Vobiz’s <Dial> + <Number>; a <Client>/<Sip> target becomes <User>. callerId and timeout keep their names.
For a SIP endpoint or softphone (Twilio’s <Client>/<Sip>), nest a <User> instead of <Number>.
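A minimal sketch of the forwarding document, again rendered with the standard library rather than the vobizxml builder (where the equivalent calls would be add_dial with add_number or add_user, per the mapping table). The helper name `build_forward`, the phone numbers, and the SIP URI are hypothetical; choosing `<User>` for any destination that starts with `sip:` is a convention of this sketch, not a documented Vobiz rule.

```python
import xml.etree.ElementTree as ET


def build_forward(caller_id: str, destination: str, timeout: int = 30) -> str:
    """Render <Dial> with a nested <Number> (PSTN) or <User> (SIP) target."""
    response = ET.Element("Response")
    dial = ET.SubElement(
        response,
        "Dial",
        {"callerId": caller_id, "timeout": str(timeout)},  # names unchanged from TwiML
    )
    # Sketch convention: SIP URIs go in <User>, everything else in <Number>.
    tag = "User" if destination.startswith("sip:") else "Number"
    ET.SubElement(dial, tag).text = destination
    return ET.tostring(response, encoding="unicode")


# Placeholder values, for illustration only.
print(build_forward("+15550100", "+15550123"))
print(build_forward("+15550100", "sip:agent@example.com"))
```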
Before / after: fork audio to a WebSocket
Twilio’s start().stream() (one-way) and connect().stream() (two-way) both map to Vobiz’s <Stream>: track becomes audioTrack, and bidirectional="true" requests two-way media.
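The sketch below renders a `<Stream>` element with the standard library. It rests on two assumptions that this page does not confirm: that the WebSocket URL is carried as the element's text content (suggested by the positional url argument of add_stream), and that audioTrack can be omitted when no track is chosen. The helper name `build_stream` and the URL are hypothetical; check the Stream reference for the accepted audioTrack values before setting one.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_stream(ws_url: str, bidirectional: bool = False,
                 audio_track: Optional[str] = None) -> str:
    """Render a <Stream> that forks call audio to a wss:// server."""
    if not ws_url.startswith("wss://"):
        raise ValueError("Stream target must be a wss:// URL")
    attributes = {"bidirectional": "true" if bidirectional else "false"}
    if audio_track is not None:
        # Twilio's track attribute; valid values are not listed on this page.
        attributes["audioTrack"] = audio_track
    response = ET.Element("Response")
    stream = ET.SubElement(response, "Stream", attributes)
    stream.text = ws_url  # assumption: URL travels as text content
    return ET.tostring(response, encoding="unicode")


# Placeholder URL; bidirectional=True mirrors Twilio's connect().stream().
print(build_stream("wss://example.com/media", bidirectional=True))
```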
Key differences
- Same document, renamed verbs. The root stays <Response>; only leaf names change (Say→Speak, Pause→Wait, Client/Sip→User). Nesting, ordering, and the “return XML from your answer URL” model are identical.
- Gather gives you two silence timers. Twilio’s single timeout becomes Vobiz’s executionTimeout, and speechTimeout becomes speechEndTimeout. Vobiz adds a dedicated digitEndTimeout for inter-digit pacing, so DTMF and speech end conditions are tuned independently. Full attribute list on the Gather reference.
- One Gather handles digits and speech together. Set inputType="dtmf speech" and whichever the caller does first is posted to your action URL, so one verb covers both menus and open-ended intent capture.
- User covers both Client and Sip. A single <User> element dials SIP endpoints and softphones, carries sendDigits for extensions, and sets custom X-VH- SIP headers via sipHeaders.
- Conference is a top-level verb. Where Twilio nests <Conference> inside <Dial>, Vobiz returns <Conference> directly from <Response>; the room is created on first join and named by the element’s text content.
- Transfers are declarative. Instead of an imperative “modify live call” step, return a <Redirect> (or new <Dial>) from your answer URL and Vobiz continues the call with the fresh XML, the same webhook-driven pattern your TwiML app already uses.
- Streaming is built in. <Stream> forks audio over wss:// with audioTrack selection and bidirectional two-way media for AI voice-agent pipelines. See the Stream reference for codecs and reconnect options.
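Of the differences above, the Conference change is the only one that alters document structure rather than names. A minimal standard-library sketch of that restructuring follows. The function name `hoist_conference` is hypothetical, and the sketch keeps only the room name: it drops attributes on both `<Dial>` and `<Conference>`, because this page does not say which of them carry over.

```python
import xml.etree.ElementTree as ET


def hoist_conference(twiml: str) -> str:
    """Replace each <Dial><Conference>room</Conference></Dial> with a
    top-level <Conference>room</Conference>, keeping only the room name."""
    root = ET.fromstring(twiml)
    for index, child in enumerate(list(root)):
        if child.tag != "Dial":
            continue
        nested = child.find("Conference")
        if nested is None:
            continue  # an ordinary Dial (Number/User target) stays as it is
        top_level = ET.Element("Conference")
        top_level.text = nested.text  # the room is named by text content
        root[index] = top_level       # swap the Dial wrapper out in place
    return ET.tostring(root, encoding="unicode")


print(hoist_conference(
    "<Response><Dial><Conference>Room 1234</Conference></Dial></Response>"
))
```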
Related
- Twilio → Vobiz overview — the full migration order and at-a-glance matrix.
- Gather XML reference — every input attribute and webhook parameter.