Skip to content

ToolCallJudge

ToolCallJudge(config?): Judge<ToolCallJudgeOptions>

Creates a deterministic judge that checks expected tool calls.

ToolCallJudgeConfig = {}

Matching behavior shared by every assessment from this judge.

Judge<ToolCallJudgeOptions>

describeEval("refund agent", {
harness: refundHarness,
judges: [ToolCallJudge({ ordered: true })],
}, (it) => {
it("creates a refund after lookup", async ({ run }) => {
await run("Refund invoice inv_123", {
metadata: {
expectedTools: ["lookupInvoice", "createRefund"],
},
});
});
});