Maxim
Specializes in multi-step agent testing with agent-specific metrics (PlanQualityMetric, ToolCorrectnessMetric), component-level evaluation, CI/CD integration, and pre-built evaluators from Google, Vertex, and OpenAI.
Tags: testing, metrics, evaluation, ci/cd, agents