# Bare-metal XML WebSocket

> Build a real-time AI voice agent on Vobiz using only the XML WebSocket streaming primitive - no LiveKit, no Pipecat, no third-party SDK required.

## Getting started

Clone and run the full working example:

```bash
git clone https://github.com/vobiz-ai/Vobiz-All-XML.git
cd Vobiz-All-XML
pip install -r requirements.txt
python server.py
```

## Overview

This example shows the lowest-level integration possible with Vobiz: raw WebSocket audio frames, manual VAD, direct STT/LLM/TTS API calls, and base64 audio encoding back to Vobiz. Use this when you need maximum control and minimum latency with no intermediary layers.

## Architecture

```text
Caller → Vobiz SIP
    ↓
XML:
    ↓
FastAPI WebSocket endpoint
    ↓
JSON event parsing → base64 decode → G.711 μ-law bytes
    ↓
Deepgram streaming STT WebSocket (speech → text)
    ↓
OpenAI ChatCompletions (text → response tokens)
    ↓
ElevenLabs / OpenAI TTS (tokens → audio bytes)
    ↓
base64 encode → JSON → WebSocket → Vobiz → Caller
```

## How it works

1. When an inbound call hits your FastAPI webhook, respond with Vobiz XML instructing Vobiz to open a bidirectional WebSocket to your server.
2. Vobiz sends JSON frames containing base64-encoded G.711 μ-law audio. Decode these frames into raw byte streams.
3. Forward raw audio bytes to Deepgram's streaming WebSocket for real-time transcription. As words are recognized, stream them to the LLM.
4. Send the transcription to OpenAI's ChatCompletions API. Response tokens stream back as they are generated.
5. Synthesize tokens using a TTS engine (ElevenLabs or OpenAI). Base64-encode the resulting audio and send it back over the WebSocket to Vobiz, which plays it to the caller.

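
The decode and encode steps described above can be sketched roughly as follows. This is a minimal illustration, not the example repository's code: the JSON field names (`event`, `media`, `payload`) and the outbound `playAudio` event name are assumptions made for this sketch, so check the Vobiz stream reference for the real schema. The μ-law expansion itself follows the standard G.711 formula and is only needed if you want linear PCM, for example for a manual energy-based VAD.

```python
import base64
import json
from typing import List, Optional

MULAW_BIAS = 0x84  # standard G.711 mu-law bias (132)


def decode_media_frame(message: str) -> Optional[bytes]:
    """Return raw mu-law bytes from one JSON frame, or None for other events.

    ASSUMPTION: the frame looks like
    {"event": "media", "media": {"payload": "<base64>"}}; these field
    names are hypothetical.
    """
    frame = json.loads(message)
    if frame.get("event") != "media":
        return None
    return base64.b64decode(frame["media"]["payload"])


def mulaw_byte_to_pcm16(byte: int) -> int:
    """Expand one G.711 mu-law byte into a signed 16-bit linear sample."""
    byte = ~byte & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + MULAW_BIAS) << exponent) - MULAW_BIAS
    return -magnitude if sign else magnitude


def mulaw_to_pcm16(data: bytes) -> List[int]:
    """Expand a mu-law byte string into a list of PCM16 samples."""
    return [mulaw_byte_to_pcm16(b) for b in data]


def encode_audio_frame(mulaw_audio: bytes) -> str:
    """Wrap synthesized mu-law audio in a JSON frame to send back.

    ASSUMPTION: the outbound event name and layout are hypothetical.
    """
    payload = base64.b64encode(mulaw_audio).decode("ascii")
    return json.dumps({"event": "playAudio", "media": {"payload": payload}})
```

Inside the WebSocket handler you would call `decode_media_frame` on each incoming text message, forward the returned bytes to the STT socket, and send the string from `encode_audio_frame` back over the same connection once TTS audio is available.
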
## Vobiz XML hook

```xml
```

## When to use this

| Use case                  | Recommendation                  |
| ------------------------- | ------------------------------- |
| Maximum latency control   | ✅ This example                  |
| Rapid prototyping         | Use LiveKit or Pipecat examples |
| Custom audio processing   | ✅ This example                  |
| Production-ready pipeline | Use LiveKit or Pipecat examples |

## Environment variables

```bash .env
DEEPGRAM_API_KEY=your-deepgram-key
OPENAI_API_KEY=sk-...
ELEVENLABS_API_KEY=your-elevenlabs-key
HTTP_PORT=8000
PUBLIC_URL=https://your-server.com
```
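
One way to read these variables at startup is sketched below. The `Settings` class and `load_settings` helper are hypothetical names for this illustration and are not taken from the example repository. The sketch also assumes that all three API keys and `PUBLIC_URL` are required and that `HTTP_PORT` defaults to 8000; relax the check if you only use one TTS provider.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# ASSUMPTION: every variable listed here is treated as mandatory.
REQUIRED = ("DEEPGRAM_API_KEY", "OPENAI_API_KEY", "ELEVENLABS_API_KEY", "PUBLIC_URL")


@dataclass(frozen=True)
class Settings:
    deepgram_api_key: str
    openai_api_key: str
    elevenlabs_api_key: str
    http_port: int
    public_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, failing fast on missing values."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        deepgram_api_key=env["DEEPGRAM_API_KEY"],
        openai_api_key=env["OPENAI_API_KEY"],
        elevenlabs_api_key=env["ELEVENLABS_API_KEY"],
        http_port=int(env.get("HTTP_PORT", "8000")),
        # Strip a trailing slash so paths can be appended consistently.
        public_url=env["PUBLIC_URL"].rstrip("/"),
    )
```

Failing at startup with a list of the missing names tends to be easier to debug than an authentication error surfacing mid-call.
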