Noveum.ai Blog

Read the latest news and articles from Noveum.ai (formerly MagicAPI Inc.).

Evals for AI Agents: What They Are, Why They Matter, and How Noveum.ai Makes Them Practical
#noveum #ai-agents #evaluations #novaeval #tracing #testing #monitoring

Learn what evals for AI agents are, why they are essential for production AI, and how Noveum.ai makes running evaluations practical without slowing down your development roadmap.

Aditi Upaddhyay

9/25/2025

GPT-OSS vs GPT-5 vs GPT-4o-mini — MMLU Benchmark Comparison (Accuracy, Runtime, Thinking Modes)
#noveum #novaeval #mmlu #benchmark #evaluation #analysis #gpt-oss #gpt-5 #gpt-4o-mini #o3 #thinking-modes #runtime #accuracy

MMLU benchmark comparison of GPT-OSS (thinking modes), GPT-5, o3, and GPT-4o-mini, focusing on accuracy, runtime efficiency, and practical model selection.

Shivam Gupta

8/13/2025

o1-mini vs gpt-4o-mini — What We Learned from 1,000 MMLU Samples
#noveum #novaeval #mmlu #evaluations #reports #analysis

We compared Azure o1-mini vs gpt-4o-mini on 1,000 MMLU math samples using NovaEval. Here’s how we tested, what worked, what didn’t, and when the 15× cost premium makes sense.

Shashank Agarwal

8/12/2025

From Development to Production - Inside Noveum.ai's AI Observability Platform
#noveum #ai #observability #tracing #sdks #llm

Discover how Noveum.ai provides comprehensive tracing and observability for AI applications, from development debugging to production optimization.

Shashank Agarwal

3/3/2025

Noveum.ai - Comprehensive AI Tracing and Observability Platform
#noveum #ai #tracing #observability #llm #rag #agents

Discover how Noveum.ai provides comprehensive tracing and observability for LLM applications, RAG systems, and multi-agent workflows with our powerful Python and TypeScript SDKs.

Shashank Agarwal

3/2/2025