Runnable examples for building voice, video, and telephony agents using LiveKit Agents
Overview
This repository contains everything you need to learn and build production-ready voice AI agents using LiveKit Agents. From single-file quickstarts to multi-agent orchestration systems with companion frontends, these examples demonstrate real-world patterns and best practices.
python-agents-examples/
├── docs/examples/ # 50+ focused, single-concept demos
└── complex-agents/ # 20+ production-style applications with frontends
Every example includes YAML frontmatter metadata (title, category, tags, difficulty, description) for easy discovery by both humans and tooling.
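As a rough illustration of how tooling might read this metadata, the sketch below pulls flat key/value pairs out of a frontmatter block. It assumes the block is delimited by `---` lines near the top of the file. The delimiter, the sample text, and the `parse_frontmatter` helper are assumptions for illustration, not the repository's actual tooling, and a real implementation would likely use a proper YAML parser.

```python
def parse_frontmatter(source: str) -> dict[str, str]:
    """Return flat key/value pairs found between the first two '---' lines."""
    metadata: dict[str, str] = {}
    inside = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == "---":
            if inside:
                break  # closing delimiter: stop reading
            inside = True
            continue
        if inside and ":" in stripped:
            key, _, value = stripped.partition(":")
            metadata[key.strip()] = value.strip()
    return metadata


# Hypothetical example file contents (assumed layout, not copied from the repo).
SAMPLE = '''"""
---
title: Tool Calling
category: basics
difficulty: beginner
---
"""
print("agent code follows")
'''

metadata = parse_frontmatter(SAMPLE)
print(metadata["title"], metadata["difficulty"])
```

The helper only handles simple `key: value` lines; list-valued fields such as tags would need real YAML parsing.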
Quick Start
Prerequisites
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.10+ | Required |
| pip / uv | Latest | Package management |
| LiveKit Account | — | Sign up free |
| Node.js | 18+ | Only for frontend demos |
| pnpm | Latest | Only for frontend demos |
Installation
    # Clone the repository
    git clone https://github.com/livekit-examples/python-agents-examples.git
    cd python-agents-examples

    # Create and activate a virtual environment
    python -m venv venv
    source venv/bin/activate  # Windows: venv\Scripts\activate

    # Install dependencies
    pip install -r requirements.txt
Environment Setup
Create a .env file in the repository root:
    # Required - LiveKit credentials
    LIVEKIT_URL=wss://your-project.livekit.cloud
    LIVEKIT_API_KEY=your_api_key
    LIVEKIT_API_SECRET=your_api_secret

    # Provider keys (add as needed for specific examples)
    OPENAI_API_KEY=sk-...
    DEEPGRAM_API_KEY=...
    CARTESIA_API_KEY=...
    ELEVENLABS_API_KEY=...
    ANTHROPIC_API_KEY=...
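Before running an example, it can help to confirm that the required LiveKit settings are actually visible to Python. The sketch below is one way to do that. It assumes the values from .env have already been loaded into the process environment (for instance by exporting them in the shell or using a loader such as python-dotenv), and `missing_vars` is a hypothetical helper rather than part of the repository.

```python
import os

# The three variables marked as required in the .env template.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing LiveKit settings:", ", ".join(missing))
    else:
        print("All required LiveKit settings are present.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.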
Run Your First Agent
    # Start an interactive voice session
    python docs/examples/listen_and_respond/listen_and_respond.py console

The console argument opens an interactive terminal session where you can speak or type to the agent.
Examples by Category
Fundamentals
Start here to understand core agent concepts.
| Example | Description | Difficulty |
|---|---|---|
| Listen and Respond | The simplest voice agent—listens and responds | Beginner |
| Tool Calling | Add function tools agents can invoke | Beginner |
| Context Variables | Inject user context into agent instructions | Beginner |
| Playing Audio | Play audio files within an agent | Beginner |
| Repeater | Echo back exactly what the user says | Beginner |
| Uninterruptable | Complete responses without interruptions | Beginner |
| Exit Message | Handle graceful session endings | Beginner |
Multi-Agent Systems
Build complex workflows with multiple specialized agents.
| Example | Description | Difficulty |
|---|---|---|
| Agent Transfer | Switch between agents mid-call using function tools | Intermediate |
| Medical Office Triage | Multi-department routing with context preservation | Advanced |
| Personal Shopper | E-commerce with triage, sales, and returns agents | Advanced |
| Doheny Surf Desk | Phone booking system with background observer agent | Advanced |
Telephony & SIP
Voice AI for phone systems.
| Example | Description | Difficulty |
|---|---|---|
| Answer Call | Basic inbound call handling | Beginner |
| Make Call | Outbound calling via SIP trunks | Beginner |
| Warm Handoff | Transfer calls to human agents | Intermediate |
| SIP Lifecycle | Complete call lifecycle management | Advanced |
| Survey Caller | Automated surveys with CSV data collection | Intermediate |
| IVR Navigator | Navigate phone menus using DTMF | Advanced |
Pipeline Customization
Intercept and modify the STT → LLM → TTS pipeline.
| Example | Description | Difficulty |
|---|---|---|
| Simple Content Filter | Keyword-based output filtering | Beginner |
| LLM Content Filter | Dual-LLM moderation system | Advanced |
| TTS Node Override | Custom text replacements before speech | Intermediate |
| Transcription Node | Modify transcriptions before LLM | Intermediate |
| Short Replies Only | Interrupt verbose responses | Beginner |
| LLM Output Replacement | Strip thinking tags from reasoning models | Intermediate |
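
To make the idea of intercepting pipeline output concrete, here is a minimal sketch of keyword-based filtering applied to a stream of text chunks before they would reach TTS. It is plain `asyncio` and does not use the LiveKit Agents API; `filter_chunks`, `BLOCKED`, and the fake stream are hypothetical names chosen for illustration. The examples in the table show how the real pipeline nodes are overridden.

```python
import asyncio
from typing import AsyncIterator, Iterable

BLOCKED = {"password", "secret"}  # hypothetical keyword list


async def filter_chunks(
    chunks: AsyncIterator[str], blocked: Iterable[str]
) -> AsyncIterator[str]:
    """Yield each chunk, or a placeholder if the chunk contains a blocked word."""
    lowered = [word.lower() for word in blocked]
    async for chunk in chunks:
        if any(word in chunk.lower() for word in lowered):
            yield "[filtered]"
        else:
            yield chunk


async def fake_llm_stream() -> AsyncIterator[str]:
    """Stand-in for streamed LLM output."""
    for chunk in ["Hello there. ", "The secret code is 1234. ", "Goodbye."]:
        yield chunk


async def collect() -> list[str]:
    return [chunk async for chunk in filter_chunks(fake_llm_stream(), BLOCKED)]


filtered = asyncio.run(collect())
print(filtered)
```

Each chunk is checked independently, so a blocked word split across two chunks would slip through; a production filter would need to buffer across chunk boundaries.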
Vision & Multimodal
Agents that can see.
| Example | Description | Difficulty |
|---|---|---|
| Gemini Live Vision | Real-time vision with Gemini 2.0 | Beginner |
| Vision Agent | Camera vision with Grok-2 Vision | Intermediate |
| Moondream Vision | Add vision to non-vision LLMs | Intermediate |
Avatars & Visual Agents
Bring your agent to life with animated avatars.
| Example | Description | Difficulty |
|---|---|---|
| Hedra Pipeline Avatar | Static image avatar with pipeline architecture | Intermediate |
| Hedra Realtime Avatar | OpenAI Realtime + Hedra avatar | Intermediate |
| Dynamic Avatar | Create avatars on-the-fly | Intermediate |
| Education Avatar | Teaching avatar with flash cards via RPC | Advanced |
| Tavus Avatar | Tavus-powered avatar assistant | Intermediate |
Translation & Multilingual
Break language barriers.
| Example | Description | Difficulty |
|---|---|---|
| Pipeline Translator | English → French voice translation | Intermediate |
| TTS Translator | Advanced translation with Gladia code-switching | Advanced |
| Change Language | Dynamic language switching via function tools | Intermediate |
Metrics & Observability
Monitor and debug your agents.
| Example | Description | Difficulty |
|---|---|---|
| LLM Metrics | Token counts, TTFT, throughput | Beginner |
| STT Metrics | Transcription timing and errors | Beginner |
| TTS Metrics | Speech synthesis performance | Beginner |
| VAD Metrics | Voice activity detection stats | Beginner |
| Langfuse Tracing | Full session tracing with Langfuse | Intermediate |
Events & State
React to conversation events and manage state.
| Example | Description | Difficulty |
|---|---|---|
| Basic Events | Register event listeners with on/off/once | Beginner |
| Event Emitters | Custom event handling patterns | Beginner |
| Conversation Monitoring | Log and inspect conversation events | Beginner |
| State Tracking | Complex NPC state with rapport system | Advanced |
| RPC State Management | CRUD operations over RPC | Advanced |
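
For readers new to the on/off/once listener pattern mentioned in the table, the sketch below shows the general idea in plain Python. The `Emitter` class is a hypothetical stand-in written for this illustration; it is not the emitter class used by LiveKit Agents, whose exact interface is covered in the Basic Events example.

```python
from collections import defaultdict
from typing import Callable


class Emitter:
    """Tiny event emitter supporting on, off, and once."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable]] = defaultdict(list)

    def on(self, event: str, handler: Callable) -> None:
        self._handlers[event].append(handler)

    def off(self, event: str, handler: Callable) -> None:
        if handler in self._handlers[event]:
            self._handlers[event].remove(handler)

    def once(self, event: str, handler: Callable) -> None:
        def wrapper(*args):
            self.off(event, wrapper)  # unregister before the first call
            handler(*args)

        self.on(event, wrapper)

    def emit(self, event: str, *args) -> None:
        # Iterate over a copy so handlers can unregister themselves safely.
        for handler in list(self._handlers[event]):
            handler(*args)


log = []
emitter = Emitter()
emitter.on("speech", lambda text: log.append(("on", text)))
emitter.once("speech", lambda text: log.append(("once", text)))
emitter.emit("speech", "hello")
emitter.emit("speech", "again")
print(log)
```

The handler registered with `once` fires only for the first `emit`, while the one registered with `on` fires every time until it is removed with `off`.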
Advanced Integrations
Connect to external services.
| Example | Description | Difficulty |
|---|---|---|
| MCP Client (stdio) | Connect to local MCP servers | Beginner |
| MCP Client (HTTP) | Connect to remote MCP servers | Beginner |
| Home Automation | Control smart home devices | Intermediate |
| RAG Voice Agent | Vector search with Annoy + embeddings | Advanced |
| Shopify Voice | Voice shopping with MCP + Shopify | Advanced |
Full Applications
These full-stack applications include both backend agents and React frontends.
🎮 Dungeons & Agents
Voice-driven D&D RPG with narrator/combat agents, character progression, and turn-based combat.
    cd complex-agents/role-playing
    python agent.py dev

    # In another terminal
    cd role_playing_frontend && pnpm install && pnpm dev
Features: Multi-agent switching, dice mechanics, NPC generation, inventory system, combat AI
📞 Doheny Surf Desk
Phone booking system with background observer agent and task groups.
    cd complex-agents/doheny-surf-desk
    python agent.py dev

Features: 5 specialized agents, LLM-based guardrails, sequential tasks, context injection
🔬 EXA Deep Researcher
Voice-controlled research agent using EXA for web intelligence.
    cd complex-agents/exa-deep-researcher
    python agent.py dev

    # In another terminal
    cd frontend && pnpm install && pnpm dev
Features: Background research jobs, RPC streaming, cited reports
🏥 Medical Office Triage
Multi-department medical system with agent transfers.
    cd complex-agents/medical_office_triage
    python triage.py dev

Features: Triage → Specialist routing, chat history preservation, YAML prompts
🍔 Drive-Thru
Fast food ordering system with menu management.
    cd complex-agents/drive-thru/drive-thru-agent
    python agent.py dev

    # In another terminal
    cd ../frontend && pnpm install && pnpm dev
📝 Nova Sonic Form Agent
Job application interview with AWS Realtime.
    cd complex-agents/nova-sonic
    python form_agent.py dev

    # In another terminal
    cd nova-sonic-form-agent && pnpm install && pnpm dev
Features: AWS Realtime model, structured data collection, live form updates
Provider Support
Examples demonstrate integration with these providers:
| Category | Providers |
|---|---|
| LLM | OpenAI, Anthropic, Google Gemini, Groq, Cerebras, AWS Bedrock, X.AI |
| STT | Deepgram, AssemblyAI, Gladia, Cartesia |
| TTS | Cartesia, ElevenLabs, Rime, PlayAI, Inworld, OpenAI |
| VAD | Silero |
| Avatar | Hedra, Tavus |
| Vision | OpenAI GPT-4V, Google Gemini, X.AI Grok, Moondream |
| Realtime | OpenAI Realtime, Google Gemini Live, AWS Nova Sonic |
Discovery Tools
Browse the Index
The complete catalog lives in docs/index.yaml with metadata for every example:
    - file_path: docs/examples/tool_calling/tool_calling.py
      title: Tool Calling
      category: basics
      tags: [tool-calling, deepgram, openai, cartesia]
      difficulty: beginner
      description: Shows how to use tool calling in an agent.
      demonstrates:
        - Using the most basic form of tool calling
Find Examples by Tag
    # Find all telephony examples
    rg "tags:.*telephony" docs/index.yaml

    # Find all advanced examples
    rg "difficulty: advanced" docs/index.yaml
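The same lookups can be sketched in Python once the index has been loaded into a list of dictionaries (for example with a YAML parser). The `find_examples` helper and the second sample entry are illustrative assumptions; only the field names and the first entry come from the index snippet in the previous section.

```python
def find_examples(entries, tag=None, difficulty=None):
    """Return entries matching the given tag and/or difficulty."""
    matches = []
    for entry in entries:
        if tag is not None and tag not in entry.get("tags", []):
            continue
        if difficulty is not None and entry.get("difficulty") != difficulty:
            continue
        matches.append(entry)
    return matches


# Sample data: the first entry mirrors the index snippet; the second is made up.
INDEX = [
    {
        "file_path": "docs/examples/tool_calling/tool_calling.py",
        "title": "Tool Calling",
        "tags": ["tool-calling", "deepgram", "openai", "cartesia"],
        "difficulty": "beginner",
    },
    {
        "file_path": "example/path/placeholder.py",
        "title": "Hypothetical Telephony Example",
        "tags": ["telephony"],
        "difficulty": "advanced",
    },
]

titles = [entry["title"] for entry in find_examples(INDEX, tag="telephony")]
print(titles)
```

Unlike the regex searches, this matches tags exactly, so a tag such as `telephony-advanced` would not be returned for `tag="telephony"`.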
Frontmatter Search
Every Python example starts with YAML frontmatter, so you can also search the source files directly:

    # Find examples using specific providers
    rg "tags:.*elevenlabs" -g "*.py"
Testing
The repository includes testing utilities in complex-agents/testing/:
    # Basic greeting test
    async def test_agent_greeting():
        session = await create_test_session()
        response = await session.generate_reply()
        assert "hello" in response.lower()
Run tests with pytest:
    cd complex-agents/testing
    pytest -v

Resources
| Resource | Link |
|---|---|
| LiveKit Agents Documentation | docs.livekit.io/agents |
| LiveKit Agents GitHub | github.com/livekit/agents |
| LiveKit Cloud | cloud.livekit.io |
Contributing
We welcome contributions! Please open an issue or PR if you:
- Find a bug or have a suggestion
- Want to add a new example
- Want to improve the documentation
Built with ❤️ by the LiveKit team
