Test your voice agent in one line of code

One HTTP call — inline persona, scenario, and endpoint. Get back a transcript, eval scores, a pass/fail verdict, and a recording URL. Phone, LiveKit, and more. No pre-registration required.

Python / pytest

from noveum_trace.testing import VoiceTester

# Supply your Noveum API key here.
tester = VoiceTester(api_key="...")

def test_cancellation_flow():
    # Place one synthetic call against the agent and wait for the result.
    result = tester.call(
        endpoint={"type": "phone", "phoneNumber": "+14155551234", "phoneCountryCode": "+1"},
        persona={"name": "Frustrated customer", "goal": "Cancel subscription"},
        scenario={"name": "Cancellation", "description": "Caller insists on canceling"},
    ).wait()
    # On failure, pytest reports the eval summary as the assertion message.
    assert result.passed, result.eval.summary
    print("Recording:", result.recording_url)

curl

curl -sS -X POST https://api.noveum.ai/api/v1/developer/calls \
  -H "Authorization: Bearer $NOVEUM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "endpoint": {"type":"livekit","livekitUrl":"...","livekitApiKey":"...",
                 "livekitApiSecret":"...","livekitAgentName":"booking-agent"},
    "persona": {"name":"New customer","goal":"Book a table for 4"},
    "scenario": {"name":"Booking","description":"Restaurant reservation"},
    "eval": true
  }'
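
If you would rather build the same request from Python without the SDK, the sketch below mirrors the curl call above using only the standard library. The URL, headers, and body fields are copied from that example. The helper names `build_call_request` and `send_call_request` are hypothetical, and because the response schema is not described here, the sender returns the raw body instead of parsing fields.

```python
import json
import urllib.request

# URL taken from the curl example above.
API_URL = "https://api.noveum.ai/api/v1/developer/calls"


def build_call_request(api_key: str, endpoint: dict, persona: dict,
                       scenario: dict, run_eval: bool = True) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example.

    Hypothetical helper: only the URL, headers, and body keys come from the page.
    """
    payload = {
        "endpoint": endpoint,
        "persona": persona,
        "scenario": scenario,
        "eval": run_eval,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_call_request(request: urllib.request.Request) -> str:
    """Send the request and return the raw response body as text.

    The response fields are not documented on this page, so nothing is parsed.
    """
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

To place a call, build the request with your own key and endpoint details, then pass it to `send_call_request`. Keeping the build and send steps separate lets you inspect or log the payload before any call is billed.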

MCP (Claude / Cursor)

# In Claude / Cursor with the Noveum MCP server configured:
> Test my voice agent at +14155551234 — try to cancel a subscription.

# Claude reads noveum://developer-calls-playbook, runs the call,
# polls for results, and shows the transcript + eval score.

Endpoint types

Phone (any country): Available
LiveKit: Available
Retell / Vapi / Pipecat / ElevenLabs: Coming soon
WebSocket: Coming soon

Pricing

25 credits / minute × up to 10 minutes = up to 250 credits per call

You're charged for the call duration up to your configured maximum. Batch runs multiply by the number of calls.
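
To budget a batch run, you can compute the worst-case cost from the numbers above. This is a minimal sketch: the rate and the 10-minute cap come from this page, while the function name `max_credits` and its parameters are hypothetical. It gives an upper bound only, since actual charges depend on how long each call lasts.

```python
CREDITS_PER_MINUTE = 25   # rate stated in the pricing section
DEFAULT_MAX_MINUTES = 10  # per-call cap stated in the pricing section


def max_credits(num_calls: int = 1, max_minutes: int = DEFAULT_MAX_MINUTES) -> int:
    """Upper bound on credits: rate x per-call minute cap x number of calls."""
    if num_calls < 0 or max_minutes < 0:
        raise ValueError("num_calls and max_minutes must be non-negative")
    return CREDITS_PER_MINUTE * max_minutes * num_calls


print(max_credits())                              # 250
print(max_credits(num_calls=20))                  # 5000
print(max_credits(num_calls=20, max_minutes=3))   # 1500
```

Lowering the configured maximum is the simplest way to cap spend on a large batch: 20 calls limited to 3 minutes each cost at most 1,500 credits instead of 5,000.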

How it works

1. Describe the call
   Pass an endpoint, a persona (who's calling + their goal), and a scenario.
2. NovaSynth calls your agent
   A synthetic caller runs a realistic conversation against your live agent.
3. Get scored results
   Transcript, eval score, pass/fail, and a recording URL (poll or await).
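
If you poll for results yourself instead of awaiting them, the loop can be sketched as below. This is a generic helper, not part of the Noveum SDK: `poll_until`, the `fetch` callable, and the `status` values in the demo are all hypothetical stand-ins, and the default interval and timeout are arbitrary choices. Swap in whatever call and completion check your integration actually uses.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(fetch: Callable[[], T],
               is_done: Callable[[T], bool],
               interval_s: float = 2.0,
               timeout_s: float = 900.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> T:
    """Call fetch() repeatedly until is_done(result) is true or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() >= deadline:
            raise TimeoutError(f"call did not finish within {timeout_s} seconds")
        sleep(interval_s)


# Demo with an in-memory stand-in for the results lookup.
# The "status" field and its values are hypothetical, not the real schema.
fake_responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "passed": True},
])

final = poll_until(
    fetch=lambda: next(fake_responses),
    is_done=lambda r: r["status"] == "completed",
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(final["passed"])
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays, which matters if the same helper runs inside a CI job.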

FAQ

What types of endpoints are supported?: Phone and LiveKit run today. Retell, Vapi, Pipecat, ElevenLabs Conversational, and raw WebSocket endpoints are accepted by the API and will execute as worker support lands.
How are evals generated?: Each call is scored by Noveum's panel of LLM judges against your scenario, returning a 0–10 score, a pass/fail verdict, a summary, and per-scorer results — plus a recording URL.
Can I use this in CI/CD?: Yes. The Python SDK ships a pytest plugin — assert on result.passed in your test suite and gate deploys on agent quality. Tag calls with serviceVersion to compare across releases.