NovaPilot - Intelligent Analysis Orchestrator

Automated AI agent and conversational dataset analysis with specialized AI agents and comprehensive reporting

What is NovaPilot?

NovaPilot is Noveum's internal intelligent orchestrator that automatically analyzes NovaEval evaluation scores. When evaluations score poorly, NovaPilot uses four specialized AI agents to examine the failures, identify root causes, and generate specific suggested fixes for your prompts, tool configurations, and agent flows. All insights and copy-paste ready solutions appear directly in your dashboard—completely automated with no setup required.

How NovaPilot Works: The Flow

📊

1. NovaEval Scores

Evaluations run on your AI traces

→

🔍

2. Automatic Analysis

NovaPilot identifies low scores

→

🤖

3. AI Agent Analysis

4 specialized agents analyze issues

→

✨

4. Suggested Fixes

Code-ready solutions generated

→

📱

5. Dashboard Display

View insights instantly

📊 Input

NovaEval scores from 73+ scorers

🔍 Detection

Identifies failing evaluations automatically

🤖 Analysis

4 AI agents diagnose root causes

✨ Output

Copy-paste ready fixes

Key Features

🤖 Specialized AI Agents

Four specialized agents analyze different aspects: flow logic, prompts, tools, and general patterns

🎯 Automatic Detection

Automatically detects dataset type (agent vs conversational) and applies appropriate analysis strategies

📊 Streaming Statistics

Memory-efficient analysis of large datasets using streaming algorithms

✅ Reasoning Validation

Validates and extracts meaningful insights from evaluation reasoning

Specialized Analysis Agents

NovaPilot employs four specialized AI agents, each focusing on different aspects of your AI system:

1. Flow Analyzer Agent

Analyzes agent execution flow and decision-making patterns
Identifies issues with state transitions and control flow
Detects loops, dead ends, and inefficient paths

2. Prompt Analyzer Agent

Examines prompt quality and effectiveness
Identifies ambiguous or problematic prompts
Suggests prompt improvements for better performance

3. Tool Analyzer Agent

Evaluates tool usage patterns and effectiveness
Identifies tool selection issues and misuse patterns
Recommends tool configuration improvements

4. General Analyzer Agent

Performs comprehensive cross-cutting analysis
Identifies systemic issues and patterns
Provides holistic recommendations

Example: NovaPilot Analysis Report

Here's what NovaPilot provides after analyzing NovaEval scores for a customer support AI assistant:

Overall Health Assessment

7.2/10

Overall Score

Grade B

Performance Rating

Traces Analyzed

⚠️ Critical Finding

Agent exhibits critical failures in maintaining contextual precision and scope boundaries, causing unreliable responses in 15% of interactions. Immediate attention required.

Issue #1: Contextual Precision Loss

CRITICALScore: 3.2/10• Affected: 12 conversations

Problem Detected
Agent loses track of the original user query after follow-up questions, providing responses that drift from the user's actual intent. Context window management is insufficient for multi-turn conversations.

Impact Analysis

15% of all conversations experience context loss
Average response quality drops by 40% after 3rd message
User satisfaction scores correlate with this issue

Suggested Fix
Add this to your system prompt:

Before each response, re-evaluate the user's main question. If the conversation has multiple turns, 
reference earlier messages to stay on topic. If you're unsure about their intent, ask: 
"Just to confirm, you're asking about [topic] - is that correct?"

Expected Improvement: 3.2 → 8.5 • 85% reduction in context drift

Issue #2: Out-of-Scope Query Handling

HIGHScore: 6.8/10• Frequency: 10% of queries

Problem Detected
Agent attempts to answer questions outside its knowledge domain, leading to hallucinated or incorrect information. Lacks proper boundary enforcement for supported topics.

Specific Examples Analyzed

Agent answered general knowledge questions unrelated to product domain
Provided speculative information not present in knowledge base
Failed to redirect users to appropriate resources for out-of-scope queries

Suggested Fix
Add this to your system prompt:

You can ONLY answer questions about [your product/domain]. For any question outside this scope, 
respond: "I specialize in [product]. I can't help with that, but I'd be happy to assist with 
[your capabilities]. What would you like to know?"

Never answer questions outside your domain, even if you think you know the answer.

Expected Improvement: 6.8 → 9.2 • 95% fewer hallucinations

Priority Recommendations

PRIORITY: CRITICAL

Improve Tool Error Handling

Implement robust error handling for search_knowledge_base tool failures. Add fallback: "I couldn't find information on that. Could you rephrase your query?"

Expected Impact: Prevents abrupt failures, improves user satisfaction

PRIORITY: HIGH

Implement Proactive Clarification

Add rule: "If a user query is ambiguous, always ask for clarification before answering"

Expected Impact: Reduces incorrect assumptions, improves answer accuracy

All of this analysis happens automatically—no setup required. Just run NovaEval and get instant insights in your dashboard!

Next Steps

NovaEval - Learn about evaluation scorers
Noveum SDK - Programmatic API access
Getting Started - Set up your first project
Dashboard Guide - Visualize your results