GitHub - Westenets/livekit-python-agents-examples: Comprehensive collection of examples for LiveKit Agents with Python

Runnable examples for building voice, video, and telephony agents using LiveKit Agents

Overview

This repository contains everything you need to learn and build production-ready voice AI agents using LiveKit Agents. From single-file quickstarts to multi-agent orchestration systems with companion frontends, these examples demonstrate real-world patterns and best practices.

python-agents-examples/
├── docs/examples/          # 50+ focused, single-concept demos
└── complex-agents/         # 20+ production-style applications with frontends

Every example includes YAML frontmatter metadata (title, category, tags, difficulty, description) for easy discovery by both humans and tooling.

Quick Start

Prerequisites

Requirement	Version	Notes
Python	3.10+	Required
pip / uv	Latest	Package management
LiveKit Account	—	Sign up free
Node.js	18+	Only for frontend demos
pnpm	Latest	Only for frontend demos

Installation

# Clone the repository
git clone https://github.com/livekit-examples/python-agents-examples.git
cd python-agents-examples

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Environment Setup

Create a .env file in the repository root:

# Required - LiveKit credentials
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret

# Provider keys (add as needed for specific examples)
OPENAI_API_KEY=sk-...
DEEPGRAM_API_KEY=...
CARTESIA_API_KEY=...
ELEVENLABS_API_KEY=...
ANTHROPIC_API_KEY=...
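
Before launching an example, it can help to confirm that the required variables are actually set. The following is a minimal standard-library sketch, not something shipped in the repository: the helper name `missing_livekit_vars` is hypothetical, and it assumes the values from `.env` have already been loaded into the process environment (for instance by your shell or a dotenv loader).

```python
import os
from typing import Mapping

# The three variables marked as required in the .env file above.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_livekit_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_livekit_vars()` with no arguments checks the current process environment; an empty list means all three credentials are present. Provider keys are left out on purpose, since which ones you need depends on the example.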

Run Your First Agent

# Start an interactive voice session
python docs/examples/listen_and_respond/listen_and_respond.py console

The console argument opens an interactive terminal session where you can speak to the agent or type messages to it.

Examples by Category

Fundamentals

Start here to understand core agent concepts.

Example	Description	Difficulty
Listen and Respond	The simplest voice agent—listens and responds	Beginner
Tool Calling	Add function tools agents can invoke	Beginner
Context Variables	Inject user context into agent instructions	Beginner
Playing Audio	Play audio files within an agent	Beginner
Repeater	Echo back exactly what the user says	Beginner
Uninterruptable	Complete responses without interruptions	Beginner
Exit Message	Handle graceful session endings	Beginner

Multi-Agent Systems

Build complex workflows with multiple specialized agents.

Example	Description	Difficulty
Agent Transfer	Switch between agents mid-call using function tools	Intermediate
Medical Office Triage	Multi-department routing with context preservation	Advanced
Personal Shopper	E-commerce with triage, sales, and returns agents	Advanced
Doheny Surf Desk	Phone booking system with background observer agent	Advanced

Telephony & SIP

Voice AI for phone systems.

Example	Description	Difficulty
Answer Call	Basic inbound call handling	Beginner
Make Call	Outbound calling via SIP trunks	Beginner
Warm Handoff	Transfer calls to human agents	Intermediate
SIP Lifecycle	Complete call lifecycle management	Advanced
Survey Caller	Automated surveys with CSV data collection	Intermediate
IVR Navigator	Navigate phone menus using DTMF	Advanced

Pipeline Customization

Intercept and modify the STT → LLM → TTS pipeline.

Example	Description	Difficulty
Simple Content Filter	Keyword-based output filtering	Beginner
LLM Content Filter	Dual-LLM moderation system	Advanced
TTS Node Override	Custom text replacements before speech	Intermediate
Transcription Node	Modify transcriptions before LLM	Intermediate
Short Replies Only	Interrupt verbose responses	Beginner
LLM Output Replacement	Strip thinking tags from reasoning models	Intermediate
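
To make the idea of intercepting a pipeline stage concrete, here is a library-independent sketch of keyword-based output filtering applied to a stream of text chunks. It deliberately does not use the LiveKit Agents API, so treat it as an illustration of the pattern rather than the code in the Simple Content Filter example. The names `filter_chunks`, `BLOCKED_WORDS`, `fake_llm_stream`, and `collect_filtered` are hypothetical, and the sketch assumes each blocked word arrives whole within a single chunk.

```python
import asyncio
import re
from typing import AsyncIterator, Iterable

# Hypothetical block list; a real filter would load its own terms.
BLOCKED_WORDS = ("darn", "heck")


async def filter_chunks(
    chunks: AsyncIterator[str], blocked: Iterable[str] = BLOCKED_WORDS
) -> AsyncIterator[str]:
    """Yield each text chunk with blocked words replaced by asterisks."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in blocked) + r")\b",
        re.IGNORECASE,
    )
    async for chunk in chunks:
        # Mask every blocked word with asterisks of the same length.
        yield pattern.sub(lambda m: "*" * len(m.group(0)), chunk)


async def fake_llm_stream() -> AsyncIterator[str]:
    """Stand-in for text streaming out of an LLM stage."""
    for chunk in ("Well, darn. ", "That is a heck of a wave."):
        yield chunk


async def collect_filtered() -> str:
    """Run the fake stream through the filter and join the result."""
    return "".join([chunk async for chunk in filter_chunks(fake_llm_stream())])
```

Running `asyncio.run(collect_filtered())` passes the stand-in stream through the filter. A word split across two chunks would slip through this version; handling that requires buffering, which is omitted here to keep the sketch short.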

Vision & Multimodal

Agents that can see.

Example	Description	Difficulty
Gemini Live Vision	Real-time vision with Gemini 2.0	Beginner
Vision Agent	Camera vision with Grok-2 Vision	Intermediate
Moondream Vision	Add vision to non-vision LLMs	Intermediate

Avatars & Visual Agents

Bring your agent to life with animated avatars.

Example	Description	Difficulty
Hedra Pipeline Avatar	Static image avatar with pipeline architecture	Intermediate
Hedra Realtime Avatar	OpenAI Realtime + Hedra avatar	Intermediate
Dynamic Avatar	Create avatars on-the-fly	Intermediate
Education Avatar	Teaching avatar with flash cards via RPC	Advanced
Tavus Avatar	Tavus-powered avatar assistant	Intermediate

Translation & Multilingual

Break language barriers.

Example	Description	Difficulty
Pipeline Translator	English → French voice translation	Intermediate
TTS Translator	Advanced translation with Gladia code-switching	Advanced
Change Language	Dynamic language switching via function tools	Intermediate

Metrics & Observability

Monitor and debug your agents.

Example	Description	Difficulty
LLM Metrics	Token counts, TTFT, throughput	Beginner
STT Metrics	Transcription timing and errors	Beginner
TTS Metrics	Speech synthesis performance	Beginner
VAD Metrics	Voice activity detection stats	Beginner
Langfuse Tracing	Full session tracing with Langfuse	Intermediate

Events & State

React to conversation events and manage state.

Example	Description	Difficulty
Basic Events	Register event listeners with on/off/once	Beginner
Event Emitters	Custom event handling patterns	Beginner
Conversation Monitoring	Log and inspect conversation events	Beginner
State Tracking	Complex NPC state with rapport system	Advanced
RPC State Management	CRUD operations over RPC	Advanced

Advanced Integrations

Connect to external services.

Example	Description	Difficulty
MCP Client (stdio)	Connect to local MCP servers	Beginner
MCP Client (HTTP)	Connect to remote MCP servers	Beginner
Home Automation	Control smart home devices	Intermediate
RAG Voice Agent	Vector search with Annoy + embeddings	Advanced
Shopify Voice	Voice shopping with MCP + Shopify	Advanced

Full Applications

These full-stack applications include both backend agents and React frontends.

🎮 Dungeons & Agents

Voice-driven D&D RPG with narrator/combat agents, character progression, and turn-based combat.

cd complex-agents/role-playing
python agent.py dev

# In another terminal
cd role_playing_frontend && pnpm install && pnpm dev

Features: Multi-agent switching, dice mechanics, NPC generation, inventory system, combat AI

📞 Doheny Surf Desk

Phone booking system with background observer agent and task groups.

cd complex-agents/doheny-surf-desk
python agent.py dev

Features: 5 specialized agents, LLM-based guardrails, sequential tasks, context injection

🔬 EXA Deep Researcher

Voice-controlled research agent using EXA for web intelligence.

cd complex-agents/exa-deep-researcher
python agent.py dev

# In another terminal
cd frontend && pnpm install && pnpm dev

Features: Background research jobs, RPC streaming, cited reports

🏥 Medical Office Triage

Multi-department medical system with agent transfers.

cd complex-agents/medical_office_triage
python triage.py dev

Features: Triage → Specialist routing, chat history preservation, YAML prompts

🍔 Drive-Thru

Fast food ordering system with menu management.

cd complex-agents/drive-thru/drive-thru-agent
python agent.py dev

# In another terminal
cd ../frontend && pnpm install && pnpm dev

📝 Nova Sonic Form Agent

Job application interview agent built on the AWS Nova Sonic realtime model.

cd complex-agents/nova-sonic
python form_agent.py dev

# In another terminal
cd nova-sonic-form-agent && pnpm install && pnpm dev

Features: AWS Realtime model, structured data collection, live form updates

Provider Support

Examples demonstrate integration with these providers:

Category	Providers
LLM	OpenAI, Anthropic, Google Gemini, Groq, Cerebras, AWS Bedrock, X.AI
STT	Deepgram, AssemblyAI, Gladia, Cartesia
TTS	Cartesia, ElevenLabs, Rime, PlayAI, Inworld, OpenAI
VAD	Silero
Avatar	Hedra, Tavus
Vision	OpenAI GPT-4V, Google Gemini, X.AI Grok, Moondream
Realtime	OpenAI Realtime, Google Gemini Live, AWS Nova Sonic

Discovery Tools

Browse the Index

The complete catalog lives in docs/index.yaml with metadata for every example:

- file_path: docs/examples/tool_calling/tool_calling.py
  title: Tool Calling
  category: basics
  tags: [tool-calling, deepgram, openai, cartesia]
  difficulty: beginner
  description: Shows how to use tool calling in an agent.
  demonstrates:
    - Using the most basic form of tool calling

Find Examples by Tag

# Find all telephony examples
rg "tags:.*telephony" docs/index.yaml

# Find all advanced examples
rg "difficulty: advanced" docs/index.yaml
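
If you would rather query the index from Python, the same lookups can be sketched with the standard library alone. This is an illustrative helper, not part of the repository: `find_examples` is a hypothetical name, and the parsing assumes every entry begins with a `- file_path:` line and lists its tags in single-line `[a, b, c]` form, as in the sample entry above. Unlike the `rg` patterns, it matches tags exactly rather than by substring.

```python
import re
from typing import Optional


def find_examples(
    index_text: str,
    tag: Optional[str] = None,
    difficulty: Optional[str] = None,
) -> list[str]:
    """Return file_path values of entries matching the tag and/or difficulty."""
    # Split the text into entries at each line that starts a new record.
    entries = re.split(r"(?m)^(?=- file_path:)", index_text)
    matches = []
    for entry in entries:
        path = re.search(r"(?m)^- file_path:\s*(\S+)", entry)
        if path is None:
            continue  # text before the first entry, or an empty split
        tags_match = re.search(r"(?m)^\s*tags:\s*\[(.*?)\]", entry)
        tags = (
            [t.strip() for t in tags_match.group(1).split(",")]
            if tags_match
            else []
        )
        level = re.search(r"(?m)^\s*difficulty:\s*(\S+)", entry)
        if tag is not None and tag not in tags:
            continue
        if difficulty is not None and (
            level is None or level.group(1) != difficulty
        ):
            continue
        matches.append(path.group(1))
    return matches
```

For example, `find_examples(open("docs/index.yaml").read(), tag="telephony")` would list the paths of telephony examples. A full YAML parser is the more robust choice if the index ever uses multi-line tag lists.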

Frontmatter Search

Every Python example starts with YAML frontmatter, so the same kind of search works directly against the source files:

# Find examples using specific providers
rg "tags:.*elevenlabs" -g "*.py"
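
For tooling that needs the metadata itself rather than a list of matching files, the frontmatter can be pulled out of a source file with a few lines of Python. The sketch below rests on an assumption about layout: that the frontmatter sits between two `---` lines near the top of the file. The function name `read_frontmatter` is hypothetical. It returns only top-level `key: value` pairs and leaves values as raw strings, so a list such as `tags` comes back unparsed and nested items under `demonstrates` are skipped.

```python
import re


def read_frontmatter(source: str) -> dict[str, str]:
    """Return top-level key/value pairs from the first block between --- lines."""
    block = re.search(r"(?ms)^---[ \t]*\n(.*?)^---[ \t]*$", source)
    if block is None:
        return {}
    fields = {}
    for line in block.group(1).splitlines():
        # Keep only unindented "key: value" lines; nested items are ignored.
        pair = re.match(r"^(\w+):\s*(\S.*)$", line)
        if pair:
            fields[pair.group(1)] = pair.group(2).strip()
    return fields
```

Given the text of an example file, `read_frontmatter(text).get("difficulty")` would return its difficulty level, or `None` when the file has no frontmatter block. Swap in a real YAML parser if you need the list fields as lists.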

Testing

The repository includes testing utilities in complex-agents/testing/:

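# Basic greeting test
# Assumes pytest-asyncio is installed; create_test_session is taken to be
# a helper provided by the testing utilities (its import is not shown here).
import pytest

@pytest.mark.asyncio
async def test_agent_greeting():
    session = await create_test_session()
    response = await session.generate_reply()
    assert "hello" in response.lower()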

Run tests with pytest:

cd complex-agents/testing
pytest -v

Resources

Resource	Link
LiveKit Agents Documentation	docs.livekit.io/agents
LiveKit Agents GitHub	github.com/livekit/agents
LiveKit Cloud	cloud.livekit.io

Contributing

We welcome contributions! Please open an issue or PR if you:

Find a bug or have a suggestion
Want to add a new example
Want to improve the documentation

Built with ❤️ by the LiveKit team