Quality Scorer | LLM-as-Judge

AnswerCompletenessScorer

Evaluates completeness and coverage of AI-generated answers. Assesses whether the answer addresses all aspects of the question and uses available context effectively.



Tags: quality, rag, conversational, llm-judge, trace-evaluation, coverage, thoroughness

Use Cases

RAG-based question answering systems
Conversational AI quality assessment

How It Works

This scorer uses the LLM-as-Judge approach to evaluate responses. It prompts a large language model with the evaluation criteria and the content to assess, then parses the model's judgment into a score and detailed reasoning.
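The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the scorer's actual implementation: the prompt wording, the `judge_completeness` function, the `call_llm` callable, and the JSON reply format are all assumptions made for the example, and `fake_llm` stands in for a real model call.

```python
import json
from typing import Callable, Optional, Union

# Hypothetical prompt; the real scorer's evaluation criteria are not published here.
PROMPT_TEMPLATE = """You are grading an answer for completeness.
Question: {question}
Context: {context}
Answer: {answer}
Reply with JSON: {{"score": <number from 0 to 10>, "reasoning": "<text>"}}"""


def judge_completeness(
    question: str,
    answer: str,
    call_llm: Callable[[str], str],
    context: Optional[Union[dict, str, list]] = None,
) -> dict:
    """Prompt a judge model, then parse its reply into a score and reasoning."""
    prompt = PROMPT_TEMPLATE.format(
        question=question,
        context=context if context is not None else "(none provided)",
        answer=answer,
    )
    raw_reply = call_llm(prompt)      # step 1: ask the judge model
    verdict = json.loads(raw_reply)   # step 2: parse its judgment
    # Clamp to the documented 0-10 range in case the model drifts outside it.
    score = max(0.0, min(10.0, float(verdict["score"])))
    return {"score": score, "reasoning": str(verdict.get("reasoning", ""))}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM client so the sketch runs offline."""
    return json.dumps({"score": 8, "reasoning": "Covers both parts of the question."})


result = judge_completeness(
    "What is X and why does it matter?", "X is a thing.", fake_llm, context=["doc A"]
)
print(result["score"], result["reasoning"])
```

Passing the model call in as a plain callable keeps the sketch independent of any particular LLM client library.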

Input Schema

Parameter	Type	Required	Description
output_text	str	Yes	Answer to evaluate for completeness
input_text	str	Yes	Original question
context	dict | str | list	No	Available context
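As an illustration of the input schema above, the following sketch models the three parameters as a Python dataclass with basic type checks. The `CompletenessInput` class is hypothetical and exists only for this example; it is not part of the scorer library, and the sample question and answer are made up.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class CompletenessInput:
    """Hypothetical container mirroring the documented input parameters."""

    output_text: str                                  # required: answer to evaluate
    input_text: str                                   # required: original question
    context: Optional[Union[dict, str, list]] = None  # optional: available context

    def __post_init__(self) -> None:
        # Both required fields must be strings.
        for name in ("output_text", "input_text"):
            if not isinstance(getattr(self, name), str):
                raise TypeError(f"{name} must be a str")
        # Context may be omitted, or given as a dict, a string, or a list.
        if self.context is not None and not isinstance(self.context, (dict, str, list)):
            raise TypeError("context must be a dict, str, or list")


# Context is optional, so both of these are valid.
without_context = CompletenessInput(
    output_text="Paris is the capital of France.",
    input_text="What is the capital of France, and what is its population?",
)
with_context = CompletenessInput(
    output_text="Paris is the capital of France.",
    input_text="What is the capital of France, and what is its population?",
    context=["Retrieved passage 1", "Retrieved passage 2"],
)
```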

Output Schema

Field	Type	Description
score	float	Completeness score (0-10)
passed	bool	True if the answer is judged complete (default threshold: 7/10)
reasoning	str	Coverage analysis
metadata.covered_aspects	list	Aspects addressed
metadata.missing_aspects	list	Aspects missing
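A result with these fields might be consumed as in the sketch below. It assumes the result arrives as a dictionary with `covered_aspects` and `missing_aspects` nested under a `metadata` key, which is inferred from the dotted field names above. The `summarize_result` helper and the sample values are illustrative only, not real scorer output.

```python
def summarize_result(result: dict) -> str:
    """Build a one-line summary from a result shaped like the output schema."""
    metadata = result.get("metadata", {})
    covered = metadata.get("covered_aspects", [])
    missing = metadata.get("missing_aspects", [])
    status = "PASS" if result["passed"] else "FAIL"
    line = f"{status} {result['score']:.1f}/10, covered {len(covered)}, missing {len(missing)}"
    if missing:
        # List the gaps so they can be fed back into prompt or retrieval fixes.
        line += ": " + "; ".join(missing)
    return line


# Made-up sample for illustration.
sample = {
    "score": 6.0,
    "passed": False,
    "reasoning": "The answer defines the term but omits trade-offs and an example.",
    "metadata": {
        "covered_aspects": ["definition"],
        "missing_aspects": ["trade-offs", "example"],
    },
}
print(summarize_result(sample))
```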

Score Interpretation

Default threshold: 7/10

Score	Rating	Description
9-10	Excellent	Response fully meets all evaluation criteria
7-8	Good	Response meets most criteria with minor issues
5-6	Fair	Response partially meets criteria, needs improvement
3-4	Poor	Response has significant issues
0-2	Failing	Response fails to meet basic criteria
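The bands and the default threshold can be expressed as a small helper. This sketch rests on two assumptions that the page does not state: a fractional score between two bands (for example 8.5) is placed in the lower band, and a response passes when its score is greater than or equal to the threshold. The `interpret_score` function is hypothetical.

```python
from typing import Tuple


def interpret_score(score: float, threshold: float = 7.0) -> Tuple[str, bool]:
    """Map a 0-10 completeness score to its rating band and a pass/fail flag."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 9:
        band = "Excellent"
    elif score >= 7:
        band = "Good"
    elif score >= 5:
        band = "Fair"
    elif score >= 3:
        band = "Poor"
    else:
        band = "Failing"
    # Assumption: passing means score >= threshold (default 7).
    return band, score >= threshold


print(interpret_score(8.5))               # default threshold of 7
print(interpret_score(6, threshold=5.0))  # a more lenient custom threshold
```

Lowering the threshold changes only the pass/fail flag; the rating band still follows the table.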

Related Scorers

Quality

InformationDensityScorer

Evaluates information density and richness of AI-generated responses. Assesses how information-rich ...

Conversational

ConversationCompletenessScorer

Evaluates whether responses fully address all aspects of the user's query within the conversation co...

Quality

ClarityAndCoherenceScorer

Evaluates clarity and coherence of AI-generated responses. Assesses language clarity, structure, log...

Quality

RAGAnswerQualityScorer

Evaluates overall quality of RAG-generated answers. Assesses question-answer alignment, measuring ho...

Frequently Asked Questions

When should I use this scorer?

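Use AnswerCompletenessScorer when you need to check whether your AI outputs address every aspect of the question and make effective use of the available context. It is particularly useful for RAG-based question answering systems and for conversational AI quality assessment.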

Why doesn't this scorer need expected output?

This scorer evaluates quality aspects that don't require comparison against a reference answer. It judges the answer against the original question and any available context, which serve as the implicit ground truth.

Can I customize the threshold?

Yes, the default threshold of 7 can be customized when configuring the scorer.

Quick Info

Category: Quality

Evaluation Type: LLM-as-Judge

Requires Expected Output: No

Default Threshold: 7/10
