Noveum.ai Blog
Read the latest news and articles from Noveum.ai (previously MagicAPI Inc.).
Learn what evals for AI agents are, why they are essential for production AI, and how Noveum.ai makes running evaluations practical without slowing down your development roadmap.

Aditi Upaddhyay
9/25/2025
MMLU benchmark comparison of GPT-OSS (thinking modes), GPT-5, o3, and GPT-4o-mini, focusing on accuracy, runtime efficiency, and practical model selection.

Shivam Gupta
8/13/2025
We compared Azure o1-mini vs gpt-4o-mini on 1,000 MMLU math samples using NovaEval. Here’s how we tested, what worked, what didn’t, and when the 15× cost premium makes sense.

Shashank Agarwal
8/12/2025
Discover how Noveum.ai provides comprehensive tracing and observability for AI applications, from development debugging to production optimization.

Shashank Agarwal
3/3/2025
Discover how Noveum.ai provides comprehensive tracing and observability for LLM applications, RAG systems, and multi-agent workflows with our powerful Python and TypeScript SDKs.

Shashank Agarwal
3/2/2025



