SDK Integration Guide

Integrate Noveum.ai tracing into your AI applications with flexible Python approaches

The Noveum.ai Python SDK provides comprehensive tracing and observability for your AI applications with minimal code changes. Whether you're building LLM applications, RAG systems, or multi-agent workflows, our flexible tracing approaches automatically capture essential metrics and traces.

🚀 Quick Start

1. Create Your Account & Get API Key

  1. Sign up at noveum.ai
  2. Generate an API key from the integration page
  3. Get your API key ready for the next step

2. Install the SDK

pip install noveum-trace

Requirements: Python 3.8+

3. Set Environment Variable

Environment Variables:

export NOVEUM_API_KEY="your-api-key"
export NOVEUM_PROJECT="my-ai-app"
export NOVEUM_ENVIRONMENT="development"
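
If you prefer to keep these values in a `.env` file, you can load them with python-dotenv before initializing (an optional convenience, not a Noveum requirement):

# Optional: requires `pip install python-dotenv`
from dotenv import load_dotenv

load_dotenv()  # reads NOVEUM_API_KEY, NOVEUM_PROJECT, NOVEUM_ENVIRONMENT from .env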

4. Initialize the SDK

import os
import noveum_trace

noveum_trace.init(
    api_key=os.getenv("NOVEUM_API_KEY"),
    project=os.getenv("NOVEUM_PROJECT"),
    environment=os.getenv("NOVEUM_ENVIRONMENT"),
)

When you initialize with noveum_trace.init(), the following happens automatically:

  • Project Creation: The project is created automatically in the Noveum UI, using the string you provide as its name
  • Environment Organization: Environments organize your traces (e.g., dev, prod, beta, staging), as in the sketch below
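
For example, switching the same code to production is just a change of strings; a minimal sketch reusing the init call above:

import os
import noveum_trace

noveum_trace.init(
    api_key=os.getenv("NOVEUM_API_KEY"),
    project="my-ai-app",       # created automatically in the UI on first use
    environment="production",  # traces are grouped under this environment
)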

🎯 Flexible Tracing Approaches

Approach 1: Context Managers

Context managers provide the most flexible way to trace specific parts of your code without additional requirements.

import os
from openai import OpenAI
import noveum_trace
from noveum_trace.context_managers import trace_llm
 
# Initialize Noveum Trace SDK
noveum_trace.init(
    api_key=os.getenv("NOVEUM_API_KEY"),
    project=os.getenv("NOVEUM_PROJECT"),
    environment=os.getenv("NOVEUM_ENVIRONMENT"),
)
 
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
 
def ask_question(question: str) -> str:
    """Ask a question to the LLM with tracing"""
    
    with trace_llm(model="gpt-4", operation="question_answering") as span:
        messages = [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": question},
        ]
        
        # Manually set input attributes
        span.set_attributes({
            "llm.messages": messages,
            "llm.prompt": question,
        })
        
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages,
        )
        
        answer = response.choices[0].message.content
        
        # Manually set output attributes
        span.set_attributes({
            "llm.response": answer,
            "llm.completion_tokens": response.usage.completion_tokens,
            "llm.prompt_tokens": response.usage.prompt_tokens,
            "llm.total_tokens": response.usage.total_tokens,
        })
        
        return answer
 
# Usage
answer = ask_question("What is the capital of France?")
# ✅ Automatically tracked: latency, cost, tokens, model performance, etc.
# ✅ Manually captured: messages, prompt, response, token usage

Approach 2: Manual Span Creation

For legacy code or when you need fine-grained control, you can manually create and manage spans.

import os
import noveum_trace
from noveum_trace import get_client
 
# Initialize Noveum Trace
noveum_trace.init(
    api_key=os.getenv("NOVEUM_API_KEY"),
    project=os.getenv("NOVEUM_PROJECT"),
    environment=os.getenv("NOVEUM_ENVIRONMENT"),
)
 
def process_data(query: str):
    client = get_client()
    
    # Create a trace if none exists
    trace = None
    if not noveum_trace.core.context.get_current_trace():
        trace = client.start_trace("manual_trace")
    
    # Create span for the operation
    span = client.start_span(
        name="data_processing",
        attributes={
            "function.name": "process_data",
            "function.query": query,
        },
    )
    
    try:
        result = f"Processed: {query.upper()}"
        
        # Add result attributes
        span.set_attributes({
            "function.result": result,
        })
        
        span.set_status("ok")
        return result
        
    except Exception as e:
        span.record_exception(e)
        span.set_status("error", str(e))
        raise
    finally:
        # Always finish the span
        client.finish_span(span)
        
        # Finish the trace if we created one
        if trace:
            client.finish_trace(trace)
 
# Usage
result = process_data("user input")
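
If this create/try/finally boilerplate repeats across your codebase, you can fold it into a small context manager of your own; a minimal sketch built only from the client calls shown above (trace creation omitted for brevity):

import contextlib
from noveum_trace import get_client

@contextlib.contextmanager
def manual_span(name: str, **attributes):
    """Hypothetical helper wrapping start_span/finish_span."""
    client = get_client()
    span = client.start_span(name=name, attributes=attributes)
    try:
        yield span
        span.set_status("ok")
    except Exception as e:
        span.record_exception(e)
        span.set_status("error", str(e))
        raise
    finally:
        client.finish_span(span)

# Usage reads like the context managers from Approach 1
with manual_span("data_processing", query="user input") as span:
    span.set_attributes({"function.result": "Processed: USER INPUT"})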

Approach 3: Mixed Approach

You can combine context managers and manual spans when working with legacy systems.

import os
from openai import OpenAI
import noveum_trace
from noveum_trace.context_managers import trace_llm
from noveum_trace import get_client
 
noveum_trace.init(
    api_key=os.getenv("NOVEUM_API_KEY"),
    project=os.getenv("NOVEUM_PROJECT"),
    environment=os.getenv("NOVEUM_ENVIRONMENT"),
)
 
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
 
def search_with_llm(query: str):
    # Use manual span for legacy database system
    trace_client = get_client()
    db_span = trace_client.start_span(
        name="legacy_db_search",
        attributes={"query": query},
    )
    
    try:
        # Legacy database search
        results = [{"content": "Paris is the capital of France"}]
        db_span.set_status("ok")
    finally:
        trace_client.finish_span(db_span)
    
    # Use context manager for LLM call
    with trace_llm(model="gpt-4", operation="generate_answer") as span:
        messages = [
            {"role": "system", "content": f"Context: {results[0]['content']}"},
            {"role": "user", "content": query}
        ]
        
        # Manually set input attributes
        span.set_attributes({
            "llm.messages": messages,
            "llm.prompt": query,
            "llm.context": results[0]['content'],
        })
        
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages,
        )
        
        answer = response.choices[0].message.content
        
        # Manually set output attributes
        span.set_attributes({
            "llm.response": answer,
            "llm.completion_tokens": response.usage.completion_tokens,
            "llm.prompt_tokens": response.usage.prompt_tokens,
            "llm.total_tokens": response.usage.total_tokens,
        })
        
        return answer
 
# Usage
answer = search_with_llm("What is the capital of France?")

Approach 4: LangGraph Integration (Complex Agent Workflows)

For LangGraph applications, use the NoveumTraceCallbackHandler to automatically trace complex agent workflows, state management, and conditional routing.

import os
from typing import TypedDict
import noveum_trace
from noveum_trace import NoveumTraceCallbackHandler
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
 
# Initialize Noveum Trace
noveum_trace.init(
    api_key=os.getenv("NOVEUM_API_KEY"),
    project="research-agent",
    environment="development"
)
 
# Define agent state
class AgentState(TypedDict):
    messages: list
    research_complete: bool
 
# Define nodes
def research_node(state: AgentState):
    """Perform research and update state"""
    # Your research logic here
    state["messages"].append(HumanMessage(content="Research completed"))
    state["research_complete"] = True
    return state
 
# Create graph
workflow = StateGraph(AgentState)
workflow.add_node("research", research_node)
workflow.add_edge("research", END)
workflow.set_entry_point("research")
 
# Compile and run with callbacks
app = workflow.compile()
handler = NoveumTraceCallbackHandler()
 
result = app.invoke(
    {"messages": [HumanMessage(content="Research AI")], "research_complete": False},
    config={"callbacks": [handler], "tags": ["langgraph"]}
)
# ✅ Automatically traces: workflow execution, node transitions, state changes, LLM calls

LangGraph-Specific Tracing:

  • Complete workflow execution and structure
  • Node-by-node execution with timing
  • State transitions and data flow
  • Conditional routing decisions
  • Iterative processes and self-loops (see the routing sketch below)
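
Conditional routing needs no extra instrumentation: the same callback handler records which branch the router takes on each pass. A minimal sketch extending the graph above (the should_continue router and edge labels are hypothetical):

def should_continue(state: AgentState) -> str:
    # The routing decision is captured in the trace on every pass
    return "done" if state["research_complete"] else "again"

workflow = StateGraph(AgentState)
workflow.add_node("research", research_node)
workflow.set_entry_point("research")
workflow.add_conditional_edges(
    "research",
    should_continue,
    {"done": END, "again": "research"},  # "again" creates a traced self-loop
)
app = workflow.compile()

result = app.invoke(
    {"messages": [], "research_complete": False},
    config={"callbacks": [handler]},
)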

Approach 5: LangChain Integration (Chains & Agents)

For LangChain applications, use the NoveumTraceCallbackHandler to automatically trace chains, agents, and retrieval operations.

import os
import noveum_trace
from noveum_trace import NoveumTraceCallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
 
# Initialize Noveum Trace
noveum_trace.init(
    api_key=os.getenv("NOVEUM_API_KEY"),
    project="my-langchain-app",
    environment="development"
)
 
# Create callback handler
handler = NoveumTraceCallbackHandler()
 
# Create chain using LCEL
prompt = ChatPromptTemplate.from_template("Summarize: {text}")
chain = prompt | ChatOpenAI() | StrOutputParser()
 
# Pass callbacks via config
result = chain.invoke(
    {"text": "Your document here"},
    config={"callbacks": [handler]}
)
# ✅ Automatically traces: LLM calls, chains, agents, tools, retrieval

LangChain-Specific Tracing:

  • Chain executions with timing and structure
  • Agent decisions and tool usage with results
  • Retrieval operations (embeddings, vector search)

What Gets Traced in Both LangGraph & LangChain:

Both approaches automatically capture comprehensive LLM metrics:

  • LLM Calls with full context:
    • Model name and provider (e.g., gpt-4, gemini-2.5-flash, claude-3)
    • Input prompts and output responses
    • Token usage (input, output, total tokens)
    • Cost tracking (input cost, output cost, total cost in USD)
    • Latency and performance metrics
    • Model parameters (temperature, max_tokens, etc.)
  • Tool Usage with execution results and timing
  • Error Tracking with detailed stack traces and status messages
  • Custom Attributes and metadata for filtering and analysis (see the sketch below)
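
Custom attributes ride along through the standard LangChain config keys; a minimal sketch reusing the chain and handler from Approach 5 (the tag and metadata values are illustrative):

result = chain.invoke(
    {"text": "Your document here"},
    config={
        "callbacks": [handler],
        "tags": ["production", "summarization"],  # for filtering in the dashboard
        "metadata": {"user_id": "user_123"},      # arbitrary key-value metadata
    },
)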

Approach 6: LiveKit Voice Agents (Audio Tracing)

For voice-enabled AI applications, use LiveKit wrappers to automatically trace speech-to-text, text-to-speech, and agent conversations.

import os
import noveum_trace
from noveum_trace.integrations.livekit import (
    LiveKitSTTWrapper,
    LiveKitTTSWrapper,
    setup_livekit_tracing,
    extract_job_context
)
from livekit.agents import Agent, AgentSession, JobContext
from livekit.plugins import deepgram, cartesia, openai
 
# Initialize Noveum Trace
noveum_trace.init(
    api_key=os.getenv("NOVEUM_API_KEY"),
    project="voice-agent",
    environment="production"
)
 
# Define your agent
class VoiceAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful voice assistant."
        )
 
# Voice agent entrypoint
async def entrypoint(ctx: JobContext):
    session_id = ctx.job.id
    
    # Extract JobContext metadata to enrich traces
    job_metadata = await extract_job_context(ctx)
    
    # Wrap STT provider for speech-to-text tracing
    traced_stt = LiveKitSTTWrapper(
        stt=deepgram.STT(model="nova-2"),
        session_id=session_id
    )
    
    # Wrap TTS provider for text-to-speech tracing
    traced_tts = LiveKitTTSWrapper(
        tts=cartesia.TTS(model="sonic-english"),
        session_id=session_id
    )
    
    # Create session with traced providers
    session = AgentSession(
        stt=traced_stt,
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=traced_tts
    )
    
    # Setup automatic session tracing with enriched metadata
    setup_livekit_tracing(session, metadata=job_metadata)
    
    # Start agent
    await session.start(agent=VoiceAgent(), room=ctx.room)
# ✅ Automatically traces: STT transcriptions, TTS generation, agent LLM calls, audio metrics, JobContext metadata

What Gets Traced:

  • Speech-to-text: Original audio recordings (playable), full transcriptions, confidence scores, latency
  • Text-to-speech: Generated audio files (playable), input text, audio metadata, generation timing
  • Agent LLM interactions and tool usage
  • Audio processing metrics and performance
  • Complete conversation sessions with context
  • JobContext metadata: Room name/SID, participant identity, job ID, agent name (via extract_job_context())

🔧 Framework Integrations

For detailed framework-specific guides with advanced examples, see the dedicated integration pages in this documentation.

📊 Advanced Features

Custom Attributes & Events

Add custom metadata to your traces for better filtering and analysis:

from noveum_trace.context_managers import trace_operation
 
def process_user_request(user_id: str, request: str):
    """Add custom attributes and events to traces"""
    
    with trace_operation("user_request") as span:
        # Add custom attributes
        span.set_attributes({
            "user.id": user_id,
            "request.type": "support",
            "request.length": len(request),
        })
        
        # Add events for important milestones
        span.add_event("processing.started", {"user_id": user_id})
        
        # Your business logic here
        result = f"Processed: {request}"
        
        span.add_event("processing.completed", {"success": True})
        
        return result
 
# Usage
result = process_user_request("user_123", "Need help")

Sampling Configuration

In production, you can control trace volume and cost by sampling only a fraction of traces:

# Configure sampling for production environments
noveum_trace.init(
    api_key="your-api-key",
    project="my-app",
    environment="production",
    sampling_rate=0.1,  # Sample 10% of traces by default
    sampling_rules=[
        {"trace_name": "health-check", "rate": 0.01},  # 1% for health checks
        {"trace_name": ".*error.*", "rate": 1.0},      # 100% for errors
        {"trace_name": ".*llm.*", "rate": 0.5},        # 50% for LLM calls
        {"trace_name": ".*rag.*", "rate": 0.2},        # 20% for RAG pipelines
    ]
)
 
# For development, you might want to sample everything
noveum_trace.init(
    api_key="your-api-key",
    project="my-app",
    environment="development",
    sampling_rate=1.0,  # Sample 100% in development
)

Error Handling

Errors must be explicitly recorded in traces for proper observability:

from noveum_trace.context_managers import trace_llm
from openai import OpenAI
import os
 
def process_with_error_handling(prompt: str):
    """Errors are explicitly recorded in traces"""
    
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
    
    with trace_llm(model="gpt-4", operation="llm_call") as span:
        try:
            response = client.chat.completions.create(
                model="gpt-4",
                messages=[{"role": "user", "content": prompt}]
            )
            return response.choices[0].message.content
            
        except Exception as e:
            # Exception must be explicitly recorded
            span.record_exception(e)
            span.set_status("error", str(e))
            print(f"Error occurred: {e}")
            raise
 
# Usage
try:
    result = process_with_error_handling("Hello!")
except Exception:
    pass  # Error details available in Noveum dashboard

📈 View Your Data

Once integrated, visit your Noveum Dashboard to:

  • 🔍 Search & Filter traces by any attribute
  • 📊 Analyze Performance trends and bottlenecks
  • 💰 Monitor Costs across different models and providers
  • 🐛 Debug Issues with detailed trace timelines
  • 👥 Collaborate with your team on insights

Next Steps

Exclusive Early Access

Get Early Access to Noveum.ai Platform

Be the first to get notified when we open the Noveum Platform to more users. All users get free access to the Observability suite; early users also get free eval jobs and premium support for the first year.

Sign up now. We send access to a new batch every week.

Early access members receive premium onboarding support and influence our product roadmap. Limited spots available.