Maxim

Specializes in multi-step agent testing. Offers agent-specific metrics (PlanQualityMetric, ToolCorrectnessMetric), component-level evaluation, CI/CD integration, and pre-built evaluators from Google, Vertex, and OpenAI.

testing metrics evaluation ci/cd agents

Similar Tools in Agent Evaluation

Arize

Enterprise ML observability platform with compliance, drift detection, custom evaluators, and agent metrics for produ...

Galileo

Automated hallucination detection using model-consensus evaluation and agentic evals, designed to identify and preven...