Basic LiveKit Voice Agent

Learn how to trace LiveKit voice agents with Noveum Trace

This guide shows how to trace a LiveKit voice agent with Noveum Trace. You'll wrap the speech-to-text (STT) and text-to-speech (TTS) providers, attach session-level tracing, and monitor voice interactions end to end.

🎯 Use Case

Drive-Thru Voice Agent: A voice-powered ordering agent that takes customer orders, uses tools to process items, and responds naturally through speech.

🚀 Complete Working Example

import os
import noveum_trace
from noveum_trace.integrations.livekit import (
    LiveKitSTTWrapper,
    LiveKitTTSWrapper,
    setup_livekit_tracing,
    extract_job_context
)
from livekit.agents import Agent, AgentSession, JobContext, function_tool
from livekit.plugins import deepgram, cartesia, openai
 
# Initialize Noveum Trace
noveum_trace.init(
    project="drive-thru-agent",
    api_key=os.getenv("NOVEUM_API_KEY"),
    environment="production"
)
 
# Define a tool for the agent
@function_tool
async def add_item_to_order(item: str, quantity: int = 1) -> str:
    """Add an item to the customer's order."""
    return f"Added {quantity}x {item} to your order"
 
# Create agent with tools
class DriveThruAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a friendly drive-thru order taker...",
            tools=[add_item_to_order]
        )
 
# Server entrypoint
async def entrypoint(ctx: JobContext):
    session_id = ctx.job.id
    
    # Extract and enrich trace with JobContext metadata
    # This adds room info, participant details, and session context to traces
    job_metadata = await extract_job_context(ctx)
    
    # Wrap STT provider
    traced_stt = LiveKitSTTWrapper(
        stt=deepgram.STT(model="nova-2", language="en-US"),
        session_id=session_id
    )
    
    # Wrap TTS provider
    traced_tts = LiveKitTTSWrapper(
        tts=cartesia.TTS(
            model="sonic-english",
            voice="friendly-voice-id"
        ),
        session_id=session_id
    )
    
    # Create session
    session = AgentSession(
        stt=traced_stt,
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=traced_tts
    )
    
    # Setup tracing with enriched context
    setup_livekit_tracing(session, metadata=job_metadata)
    
    print(f"🍔 Agent connected to room: {ctx.room.name}")
    
    # Start agent
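    await session.start(agent=DriveThruAgent(), room=ctx.room)


# Run the worker so LiveKit can dispatch jobs to the entrypoint
if __name__ == "__main__":
    from livekit.agents import WorkerOptions, cli

    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))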

📋 Prerequisites

pip install noveum-trace livekit livekit-agents
pip install livekit-plugins-deepgram livekit-plugins-cartesia livekit-plugins-openai

Set your environment variables:

export NOVEUM_API_KEY="your-noveum-api-key"
export DEEPGRAM_API_KEY="your-deepgram-api-key"
export CARTESIA_API_KEY="your-cartesia-api-key"
export OPENAI_API_KEY="your-openai-api-key"
export LIVEKIT_URL="your-livekit-url"
export LIVEKIT_API_KEY="your-livekit-api-key"
export LIVEKIT_API_SECRET="your-livekit-api-secret"
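
A missing key tends to surface only as an authentication error once the agent is already running, so it can help to check the environment up front. The following is a minimal sketch that assumes the variable names listed above; `missing_env_vars` is a hypothetical helper, not part of Noveum Trace or LiveKit.

```python
import os

# Variable names taken from the export list above.
REQUIRED_ENV_VARS = [
    "NOVEUM_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
    "OPENAI_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
]


def missing_env_vars(names=REQUIRED_ENV_VARS, environ=None):
    """Return the names that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Report anything missing without stopping the process.
missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Calling this before `noveum_trace.init()` would let the process fail early with a readable message instead of partway through a call.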

🔧 How It Works

1. STT Wrapper

Wraps your speech-to-text provider to trace:

Audio transcriptions: Full text of what was spoken
Audio input duration and format
Processing latency and performance
Confidence scores and accuracy metrics
Provider details (model, language, etc.)
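
To make the wrapping idea concrete, here is a rough sketch of the general pattern: delegate to the underlying provider, time the call, and record a span. It is not the implementation of `LiveKitSTTWrapper`; `TimedSTT`, `TranscriptionSpan`, and `fake_transcribe` are hypothetical names, and the provider is a stand-in so the example runs without any API keys.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class TranscriptionSpan:
    """Illustrative record of one traced STT call (hypothetical shape)."""
    provider: str
    model: str
    language: str
    text: str
    confidence: float
    latency_ms: float


class TimedSTT:
    """Hypothetical wrapper: delegates to `transcribe_fn` and records a span per call."""

    def __init__(self, transcribe_fn, provider, model, language):
        self._transcribe_fn = transcribe_fn
        self.provider = provider
        self.model = model
        self.language = language
        self.spans = []

    async def transcribe(self, audio: bytes) -> str:
        start = time.perf_counter()
        text, confidence = await self._transcribe_fn(audio)
        latency_ms = (time.perf_counter() - start) * 1000
        self.spans.append(
            TranscriptionSpan(
                self.provider, self.model, self.language,
                text, confidence, latency_ms,
            )
        )
        return text


async def fake_transcribe(audio: bytes):
    """Stand-in for a real provider call; returns (text, confidence)."""
    await asyncio.sleep(0.01)
    return "one cheeseburger please", 0.97


stt = TimedSTT(fake_transcribe, provider="deepgram", model="nova-2", language="en-US")
result = asyncio.run(stt.transcribe(b"\x00" * 320))
```

The caller still receives only the transcript, which is why a wrapper of this kind can be dropped in where the raw provider was used.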

2. TTS Wrapper

Wraps your text-to-speech provider to trace:

Text-to-speech input: Exact text sent for audio generation
Generated audio metadata: Duration, format, and quality details
Audio generation time and latency
Voice and model configuration
Provider details and parameters
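
Audio duration and generation latency are most useful when read together. As a small worked example, assuming the TTS output is uncompressed 16-bit PCM (the source does not say which format Noveum records), duration follows from the byte count, and a real-time factor below 1 means audio is generated faster than it plays back. Both function names are hypothetical.

```python
def pcm_duration_seconds(num_bytes, sample_rate, channels=1, bytes_per_sample=2):
    """Duration of raw PCM audio: bytes / (sample_rate * channels * bytes_per_sample)."""
    bytes_per_second = sample_rate * channels * bytes_per_sample
    if bytes_per_second <= 0:
        raise ValueError("sample_rate, channels and bytes_per_sample must be positive")
    return num_bytes / bytes_per_second


def real_time_factor(generation_seconds, audio_seconds):
    """Generation time divided by audio length; below 1.0 is faster than playback."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


# 48,000 bytes of 24 kHz mono 16-bit audio is exactly one second.
audio = b"\x00" * 48000
duration = pcm_duration_seconds(len(audio), sample_rate=24000)
rtf = real_time_factor(generation_seconds=0.25, audio_seconds=duration)
```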

3. Session Tracing

setup_livekit_tracing() automatically traces:

Agent lifecycle events
User speech inputs
Agent responses
Tool executions
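
One plausible way to picture session-level tracing is as a set of event handlers attached to the session. The sketch below uses a tiny stand-in emitter rather than a real `AgentSession`, and the event names are hypothetical, so treat it as an illustration of the pattern and not as a description of what `setup_livekit_tracing()` does internally.

```python
from collections import defaultdict


class EventEmitter:
    """Tiny stand-in for an event-emitting session object."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        self._handlers[event].append(handler)

    def emit(self, event, payload):
        for handler in self._handlers[event]:
            handler(payload)


def attach_tracing(session, events, sink):
    """Subscribe one recorder per event; each appends (event, payload) to `sink`."""
    for event in events:
        # Bind `event` as a default argument so each handler keeps its own name.
        session.on(event, lambda payload, event=event: sink.append((event, payload)))


# Hypothetical event names, chosen to mirror the list above.
TRACED_EVENTS = ["user_speech", "agent_response", "tool_execution"]

session = EventEmitter()
trace_log = []
attach_tracing(session, TRACED_EVENTS, trace_log)

session.emit("user_speech", {"text": "Two fries, please"})
session.emit("tool_execution", {"tool": "add_item_to_order", "item": "fries", "quantity": 2})
session.emit("unrelated", {})  # not subscribed, so nothing is recorded
```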

4. Context Enrichment

extract_job_context() enriches traces with LiveKit session metadata:

Job Information: Job ID, agent name, and dispatch information
Room Details: Room name, SID, and metadata
Participant Info: Participant identity and connection details
Session Context: Timestamps, permissions, and custom metadata

What gets added to your traces:

{
    "livekit.job.id": "job_123",
    "livekit.room.name": "customer-session-456",
    "livekit.room.sid": "RM_abc123",
    "livekit.participant.identity": "user_789",
    "livekit.agent.name": "drive-thru-agent",
    "session.metadata": {...}  # Custom metadata from room
}

Benefits:

Better filtering: Search traces by room, participant, or job ID
Context awareness: Understand which user session each trace belongs to
Debugging: Quickly identify issues for specific rooms or participants
Analytics: Aggregate metrics by room, user, or agent
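
The filtering and aggregation benefits can be sketched over plain dictionaries. This example assumes traces have been exported as dicts carrying the attribute keys shown above; the sample records and the helpers `filter_traces` and `count_by` are made up for illustration and are not a Noveum API.

```python
from collections import Counter

# Made-up sample records using the attribute keys shown above.
traces = [
    {"livekit.room.name": "customer-session-456",
     "livekit.participant.identity": "user_789", "name": "stt"},
    {"livekit.room.name": "customer-session-456",
     "livekit.participant.identity": "user_789", "name": "tts"},
    {"livekit.room.name": "customer-session-457",
     "livekit.participant.identity": "user_790", "name": "stt"},
]


def filter_traces(traces, criteria):
    """Keep traces whose attributes match every key/value pair in `criteria`."""
    return [
        trace for trace in traces
        if all(trace.get(key) == value for key, value in criteria.items())
    ]


def count_by(traces, key):
    """Count traces per distinct value of the attribute `key`."""
    return Counter(trace.get(key) for trace in traces)


room_traces = filter_traces(traces, {"livekit.room.name": "customer-session-456"})
per_room = count_by(traces, "livekit.room.name")
```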

📊 What You'll See in the Dashboard

After running the agent, check your Noveum dashboard:

Trace View

Complete conversation flow
STT transcriptions with original audio recordings
Agent LLM calls
Tool executions
TTS generations with synthesized audio

Span Details

Audio recordings: Play back actual STT input audio and TTS output audio
Full transcription text and synthesis text
Audio processing times and latency
Transcription accuracy and confidence scores
Response timing and quality metrics
Tool call details

Metrics

Session duration
Turn-by-turn timing
Audio quality metrics
Cost tracking per operation
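
Turn-by-turn timing comes down to simple arithmetic over timestamps. The sketch below assumes each turn records when the user stopped speaking and when the agent started responding, both in seconds since session start. The `Turn` shape and the sample values are hypothetical, and the dashboard may define its metrics differently.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """Hypothetical per-turn timestamps, in seconds since session start."""
    user_speech_end: float
    agent_speech_start: float


def response_latencies(turns):
    """Gap between the end of user speech and the start of the agent's reply."""
    return [turn.agent_speech_start - turn.user_speech_end for turn in turns]


def mean(values):
    """Arithmetic mean; returns 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


turns = [Turn(1.5, 2.0), Turn(6.25, 7.0)]
latencies = response_latencies(turns)  # 0.5 s and 0.75 s
average_latency = mean(latencies)      # 0.625 s
```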

🔍 Troubleshooting

No audio traces?

Verify that both providers are wrapped (LiveKitSTTWrapper and LiveKitTTSWrapper)
Check that setup_livekit_tracing() is called before session.start()
Ensure the wrapped providers, not the raw ones, are passed to AgentSession

Missing tool executions?

Verify that tools are defined with @function_tool
Check that the agent passes them in its constructor, for example tools=[add_item_to_order]
Ensure the LLM has access to the tool definitions

💡 Pro Tips

Use session IDs: Tie traces to user sessions for better context
Monitor latency: Track STT/TTS processing times for optimization
Extract job context: Always use extract_job_context() to enrich traces with room and participant metadata
Add custom metadata: Include user context and business-specific data in trace attributes

🚀 Next Steps

Explore LangChain integration for chaining
Learn about LangGraph agents for complex workflows