Pipecat Integration Overview

Add automatic tracing to Pipecat voice pipelines with Noveum Trace

Noveum Trace adds automatic tracing to your Pipecat voice pipeline in minutes. Every conversation is recorded as a structured trace with per-turn spans for STT, LLM, and TTS; tool/function-call details are attached to the LLM span as attributes (when available), along with latency and token usage.

Prerequisites

Python 3.11+
A working Pipecat pipeline (pipecat-ai)
A Noveum API key (get one at noveum.ai)

Installation

pip install "noveum-trace[pipecat]"

Quick Start

Integration is two calls on a NoveumPipecatTracer. Your transport, pipeline, and PipelineTask stay stock Pipecat — no wrappers, no class-swaps.

Initialize noveum_trace once at startup.
Create a NoveumPipecatTracer.
Wrap your pipeline with tracer.observe_pipeline(pipeline).
Register handlers with await tracer.register_task_handlers(task, transport=transport).

import asyncio
 
import noveum_trace
from noveum_trace.integrations.pipecat import NoveumPipecatTracer
 
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.pipeline.runner import PipelineRunner
 
# 1) Initialize noveum-trace once at startup
noveum_trace.init(
    api_key="your-noveum-api-key",
    project="my-voice-bot",
)
 
# 2) Create a tracer
tracer = NoveumPipecatTracer(record_audio=True)
 
 
async def main():
    # --- your existing pipeline setup (stock Pipecat) ---
    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ])
 
    # 3) Wrap the pipeline — auto-inserts an AudioBufferProcessor when needed.
    #    You MUST use the return value.
    pipeline = tracer.observe_pipeline(pipeline)
 
    task = PipelineTask(
        pipeline,
        params=PipelineParams(enable_metrics=True, enable_usage_metrics=True),
    )
 
    # 4) Register handlers — adds the observer, wires turn tracking, taps the
    #    transport for raw audio, and stamps session metadata.
    #    You MUST use the return value.
    task = await tracer.register_task_handlers(task, transport=transport)
 
    runner = PipelineRunner()
    await runner.run(task)
 
asyncio.run(main())

Traces are flushed automatically when the pipeline ends (EndFrame / CancelFrame).

Always use the return value of both observe_pipeline() and register_task_handlers() — each may return a new/modified object. Discarding either return value means the wiring is lost.

Even shorter

When you don't need to set anything on the PipelineTask between wrapping and registration, collapse both calls into one:

task = await tracer.observe_and_create_task(
    pipeline,
    transport=transport,
    params=PipelineParams(enable_metrics=True, enable_usage_metrics=True),
)

What Gets Traced

Each pipeline session produces one conversation trace containing a turn span per conversational exchange. Each turn has child spans for STT, LLM, and TTS. When record_audio=True (the default), observe_pipeline() auto-inserts an AudioBufferProcessor and a full-conversation recording span is created at the root.

Trace: pipecat.conversation
│   pipeline.allow_interruptions, pipeline.sample_rate
│   conversation.total_input_tokens, conversation.total_output_tokens
│   conversation.total_cost, conversation.turn_count
│   conversation.barge_in_rate          (interrupted_turns / total_turns, 0.0–1.0)
│   conversation.interrupted_turn_count (raw count of interrupted turns)
│
├── Span: pipecat.turn  (one per user→bot exchange)
│   ├── turn.number, turn.user_input, turn.duration_seconds
│   ├── turn.user_bot_latency_seconds   (when latency observer is wired)
│   ├── turn.was_interrupted
│   ├── turn.eou_is_complete, turn.eou_confidence  (SmartTurn, when available)
│   ├── turn.latency.user_turn_secs          (VAD+STT+turn-analyzer wait duration)
│   ├── turn.latency.ttfb.<service>_ms       (per-service TTFB: stt, llm, tts)
│   ├── turn.latency.text_aggregation_ms     (LLM token → first TTS sentence delay)
│   │
│   ├── Span: pipecat.stt
│   │   ├── stt.text, stt.is_final, stt.language, stt.user_id
│   │   ├── stt.model, stt.confidence
│   │   ├── stt.vad_to_final_ms, stt.first_text_latency_ms
│   │   ├── stt.interim_results  (JSON list of {text, confidence})
│   │   ├── stt.audio_uuid  (when record_audio=True)
│   │   ├── stt.was_cancelled    (True when interrupted before final transcript)
│   │   ├── stt.partial_transcript  (last interim text received before cancellation)
│   │   ├── stt.interim_count    (number of interim results received)
│   │   └── stt.vad_to_cancel_ms (ms from VAD start to cancellation)
│   │
│   ├── Span: pipecat.llm
│   │   ├── llm.model, llm.system_prompt, llm.temperature, llm.max_tokens
│   │   ├── llm.input  (full message history JSON), llm.output
│   │   ├── llm.input_tokens, llm.output_tokens, llm.total_tokens
│   │   ├── llm.cost.input, llm.cost.output, llm.cost.total, llm.cost.currency
│   │   ├── llm.time_to_first_token_ms
│   │   ├── llm.thoughts[], llm.thought_signatures[]  (thinking-enabled models)
│   │   ├── llm.function_calls[], llm.function_call_results[]  (tool calls)
│   │   └── llm.tools, llm.tool_choice  (when set)
│   │
│   └── Span: pipecat.tts
│       ├── tts.input_text, tts.voice, tts.model
│       ├── tts.time_to_first_byte_ms, tts.characters
│       └── tts.audio_uuid  (when record_audio=True)
│
└── Span: pipecat.full_conversation  (when record_audio=True + AudioBufferProcessor in pipeline)
    └── full_conversation.audio_uuid, full_conversation.audio_format
        full_conversation.audio_channels, full_conversation.duration_ms
        full_conversation.sample_rate
        (stereo WAV: left channel = user, right channel = bot)

Configuration Options

All options are passed to the NoveumPipecatTracer constructor:

tracer = NoveumPipecatTracer(
    record_audio=True,             # per-span STT/TTS audio + full-conversation WAV (default True)
    record_raw_input_audio=True,   # also capture pre-filter mic bytes (default True)
    capture_custom_spans=False,    # fold plain-OTEL spans into the trace (default False)
    auto_enable_metrics=True,      # force PipelineParams metrics flags True (default True)
    capture_errors=True,           # ErrorFrame/FatalErrorFrame → span errors (default True)
    capture_system_logs=False,     # SystemLogFrame → span events (default False, opt-in)
    capture_session_metadata=True, # room URL + transport type on root trace (default True)
 
    # Any NoveumTraceObserver kwarg is also forwarded, e.g.:
    trace_name_prefix="pipecat",   # produces trace named "pipecat.conversation"
    capture_text=True,             # capture LLM input/output and TTS text in spans
    capture_function_calls=True,   # record tool calls on the pipecat.llm span
)

Common tweaks:

Turn off capture_text if you want less text stored in spans.
Set capture_function_calls=False if your LLM never emits function-call frames.
record_audio=True (the default) does three things:
- Uploads per-span audio and adds stt.audio_uuid / tts.audio_uuid attributes.
- Auto-inserts an AudioBufferProcessor inside observe_pipeline() when one isn't already present — no manual wiring needed.
- Captures a full stereo conversation WAV as a pipecat.full_conversation span (left channel = user, right channel = bot).
- Set record_audio=False to disable all audio capture.
record_raw_input_audio=True (the default) taps the transport for pre-filter mic bytes, adding stt.raw_audio_uuid. Pass transport= to register_task_handlers so the tap can be applied; set False to opt out.

The underlying observer is available as tracer.observer for advanced use.

Troubleshooting

No traces appearing

Verify noveum_trace.init() is called before the pipeline starts.
Make sure you assigned the return values: pipeline = tracer.observe_pipeline(pipeline) and task = await tracer.register_task_handlers(task, ...). Discarding either return value means the wiring is lost.
Confirm your API key and the project name match what you configured in the Noveum dashboard.

Turn spans missing or not splitting correctly

Turn tracking is enabled by default in recent Pipecat versions. register_task_handlers also installs a fallback TurnTrackingObserver if you explicitly built the task with enable_turn_tracking=False.

LLM token counts not appearing

With auto_enable_metrics=True (the default), register_task_handlers forces enable_metrics / enable_usage_metrics on the task, so MetricsFrame is emitted without extra setup.
Token counts come from Pipecat's MetricsFrame. Most standard Pipecat LLM services emit them when metrics are enabled.

Function call spans missing

Set capture_function_calls=True.
Confirm your LLM processor emits FunctionCallInProgressFrame / FunctionCallResultFrame.

System prompt (llm.system_prompt) not appearing

This attribute is read from the LLM processor's _settings.system_instruction (or _settings.system_prompt) at the time LLMFullResponseStartFrame fires, then falls back to scanning the message history for a role: "system" entry.
If neither source has the value (e.g. you use a custom or pre-cached LLM processor that does not extend Pipecat's BaseLLMService), the attribute will be absent.
Fix: call trace.set_attributes({"llm.system_prompt": YOUR_SYSTEM_PROMPT}) on the active trace before the first turn, or ensure your custom LLM processor emits LLMContextFrame containing the system role message.

STT spans show only pipecat_span_status: cancelled with no transcript

This is expected for interrupted turns — when the user or bot interrupts before STT finishes, the span is closed with cancellation status.
Starting with SDK v1.6.x, cancelled spans also capture stt.partial_transcript (last interim text), stt.interim_count, and stt.vad_to_cancel_ms so partial speech data is not lost.

Latency breakdown (turn.latency.*) not appearing

These attributes come from Pipecat's UserBotLatencyObserver.on_latency_breakdown event. They require:
1. await tracer.register_task_handlers(task, transport=transport) to be called (wires the latency observer).
2. enable_metrics to be set on PipelineParams — handled automatically by auto_enable_metrics=True (the default); otherwise TTFB fields in the breakdown are empty.
3. SDK v1.6.x or later (earlier versions only captured the total turn.user_bot_latency_seconds).

User audio missing from full_conversation (mono instead of stereo)

If a custom processor between transport input and AudioBufferProcessor consumes InputAudioRawFrame without re-emitting it downstream, AudioBufferProcessor never receives the user audio. The SDK cannot work around this.
Fix: modify your custom processor to re-emit InputAudioRawFrame after processing it internally, so frames continue downstream to AudioBufferProcessor.

Next Steps

Explore a complete example: Basic Pipecat Voice Pipeline