ToolCallJudge
ToolCallJudge() reads normalized tool calls from the harness run. Use it when
the sequence of tools is part of the behavior you want to preserve.
import { ToolCallJudge } from "vitest-evals";

describeEval("capital questions", {
  harness: qaHarness,
  judges: [
    ToolCallJudge({
      ordered: true,
    }),
  ],
});

Arguments
Use matcher options when a tool argument is the important assertion.
ToolCallJudge({
  tools: [
    {
      name: "lookupCapital",
      arguments: {
        country: "France",
      },
    },
  ],
});

Expected Tools
Pass expected tool names through metadata on each run.
const result = await run("What is the capital of France?", {
  metadata: {
    expectedTools: ["lookupCapital"],
  },
});

Failure Behavior
The judge fails when a required tool is missing, an unexpected order appears in ordered mode, or configured arguments do not match the captured call. Use ordered checks only when the call order matters; leave ordering loose when multiple tool sequences are equivalent for the behavior under test.
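To make the three failure conditions concrete, here is a minimal standalone sketch of how such a check could be written. It is not the vitest-evals implementation: the `ToolCall` and `ExpectedTool` shapes and the `argsMatch` and `checkToolCalls` helpers are hypothetical names introduced only for illustration, and the argument comparison is a simple JSON comparison of the configured keys.

```typescript
// Hypothetical shapes for illustration only; not the vitest-evals types.
type ToolCall = { name: string; arguments?: Record<string, unknown> };
type ExpectedTool = { name: string; arguments?: Record<string, unknown> };

// True when every configured argument equals the captured value.
// Keys that are not configured on the expectation are ignored.
function argsMatch(expected: ExpectedTool, actual: ToolCall): boolean {
  const want = expected.arguments ?? {};
  const got = actual.arguments ?? {};
  return Object.keys(want).every(
    (key) => JSON.stringify(got[key]) === JSON.stringify(want[key]),
  );
}

// Returns one message per failed expectation; an empty array means a pass.
// Simplification: a captured call may satisfy more than one expectation.
function checkToolCalls(
  expected: ExpectedTool[],
  actual: ToolCall[],
  ordered: boolean,
): string[] {
  const failures: string[] = [];
  let cursor = 0; // ordered mode: index where the next search starts

  for (const want of expected) {
    const start = ordered ? cursor : 0;
    const index = actual.findIndex(
      (call, i) => i >= start && call.name === want.name && argsMatch(want, call),
    );
    if (index !== -1) {
      if (ordered) cursor = index + 1;
      continue;
    }

    // Classify the failure: missing tool, wrong arguments, or wrong order.
    const sameName = actual.filter((call) => call.name === want.name);
    if (sameName.length === 0) {
      failures.push(`missing tool: ${want.name}`);
    } else if (!sameName.some((call) => argsMatch(want, call))) {
      failures.push(`argument mismatch: ${want.name}`);
    } else {
      failures.push(`out of order: ${want.name}`);
    }
  }
  return failures;
}

// "formatAnswer" is a made-up second tool used only to show ordering.
const captured: ToolCall[] = [
  { name: "lookupCapital", arguments: { country: "France" } },
  { name: "formatAnswer" },
];
const reversed: ExpectedTool[] = [
  { name: "formatAnswer" },
  { name: "lookupCapital" },
];

console.log(checkToolCalls(reversed, captured, false)); // []
console.log(checkToolCalls(reversed, captured, true)); // ["out of order: lookupCapital"]
```

The same captured calls pass the loose check and fail the ordered one, which is why ordered mode is worth enabling only when the sequence itself is the behavior under test.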