Documentation

NovaPilot - Intelligent Analysis Orchestrator

Automated AI agent and conversational dataset analysis with specialized AI agents and comprehensive reporting

What is NovaPilot?

NovaPilot is Noveum's internal intelligent orchestrator that automatically analyzes NovaEval evaluation scores. When evaluations score poorly, NovaPilot uses four specialized AI agents to examine the failures, identify root causes, and generate specific suggested fixes for your prompts, tool configurations, and agent flows. All insights and copy-paste ready solutions appear directly in your dashboard—completely automated with no setup required.

How NovaPilot Works: The Flow

📊
1. NovaEval Scores
Evaluations run on your AI traces
🔍
2. Automatic Analysis
NovaPilot identifies low scores
🤖
3. AI Agent Analysis
4 specialized agents analyze issues
4. Suggested Fixes
Code-ready solutions generated
📱
5. Dashboard Display
View insights instantly
📊 Input
NovaEval scores from 73+ scorers
🔍 Detection
Identifies failing evaluations automatically
🤖 Analysis
4 AI agents diagnose root causes
✨ Output
Copy-paste ready fixes

Key Features

🤖 Specialized AI Agents

Four specialized agents analyze different aspects: flow logic, prompts, tools, and general patterns

🎯 Automatic Detection

Automatically detects dataset type (agent vs conversational) and applies appropriate analysis strategies

📊 Streaming Statistics

Memory-efficient analysis of large datasets using streaming algorithms

✅ Reasoning Validation

Validates and extracts meaningful insights from evaluation reasoning

Specialized Analysis Agents

NovaPilot employs four specialized AI agents, each focusing on different aspects of your AI system:

1. Flow Analyzer Agent

  • Analyzes agent execution flow and decision-making patterns
  • Identifies issues with state transitions and control flow
  • Detects loops, dead ends, and inefficient paths

2. Prompt Analyzer Agent

  • Examines prompt quality and effectiveness
  • Identifies ambiguous or problematic prompts
  • Suggests prompt improvements for better performance

3. Tool Analyzer Agent

  • Evaluates tool usage patterns and effectiveness
  • Identifies tool selection issues and misuse patterns
  • Recommends tool configuration improvements

4. General Analyzer Agent

  • Performs comprehensive cross-cutting analysis
  • Identifies systemic issues and patterns
  • Provides holistic recommendations

Example: NovaPilot Analysis Report

Here's what NovaPilot provides after analyzing NovaEval scores for a customer support AI assistant:

Overall Health Assessment

7.2/10
Overall Score
Grade B
Performance Rating
24
Traces Analyzed

⚠️ Critical Finding

Agent exhibits critical failures in maintaining contextual precision and scope boundaries, causing unreliable responses in 15% of interactions. Immediate attention required.

Issue #1: Contextual Precision Loss

CRITICALScore: 3.2/10• Affected: 12 conversations

Problem Detected
Agent loses track of the original user query after follow-up questions, providing responses that drift from the user's actual intent. Context window management is insufficient for multi-turn conversations.

Impact Analysis

  • 15% of all conversations experience context loss
  • Average response quality drops by 40% after 3rd message
  • User satisfaction scores correlate with this issue

Suggested Fix
Add this to your system prompt:

Before each response, re-evaluate the user's main question. If the conversation has multiple turns, 
reference earlier messages to stay on topic. If you're unsure about their intent, ask: 
"Just to confirm, you're asking about [topic] - is that correct?"

Expected Improvement: 3.2 → 8.5 • 85% reduction in context drift

Issue #2: Out-of-Scope Query Handling

HIGHScore: 6.8/10• Frequency: 10% of queries

Problem Detected
Agent attempts to answer questions outside its knowledge domain, leading to hallucinated or incorrect information. Lacks proper boundary enforcement for supported topics.

Specific Examples Analyzed

  • Agent answered general knowledge questions unrelated to product domain
  • Provided speculative information not present in knowledge base
  • Failed to redirect users to appropriate resources for out-of-scope queries

Suggested Fix
Add this to your system prompt:

You can ONLY answer questions about [your product/domain]. For any question outside this scope, 
respond: "I specialize in [product]. I can't help with that, but I'd be happy to assist with 
[your capabilities]. What would you like to know?"

Never answer questions outside your domain, even if you think you know the answer.

Expected Improvement: 6.8 → 9.2 • 95% fewer hallucinations

Priority Recommendations

PRIORITY: CRITICAL

Improve Tool Error Handling

Implement robust error handling for search_knowledge_base tool failures. Add fallback: "I couldn't find information on that. Could you rephrase your query?"

Expected Impact: Prevents abrupt failures, improves user satisfaction

PRIORITY: HIGH

Implement Proactive Clarification

Add rule: "If a user query is ambiguous, always ask for clarification before answering"

Expected Impact: Reduces incorrect assumptions, improves answer accuracy


All of this analysis happens automatically—no setup required. Just run NovaEval and get instant insights in your dashboard!

Next Steps

Exclusive Early Access

Get Early Access to Noveum.ai Platform

Be the first one to get notified when we open Noveum Platform to more users. All users get access to Observability suite for free, early users get free eval jobs and premium support for the first year.

Sign up now. We send access to new batch every week.

Early access members receive premium onboarding support and influence our product roadmap. Limited spots available.

On this page