StructuredOutputJudge
StructuredOutputJudge() compares the harness output with expected values from
metadata or matcher options. Use it for deterministic checks when the harness
returns a parsed domain object instead of raw provider text.
import { describeEval, StructuredOutputJudge } from "vitest-evals";

describeEval("capital questions", {
  harness: structuredQaHarness,
  judges: [StructuredOutputJudge()],
});

Explicit Assertion
Use explicit assertions when a case needs an extra check or when you want to record a score separately from the suite defaults.
it("records a structured output score", async ({ run }) => {
  const result = await run("What is the capital of France?", {
    metadata: {
      expected: { answer: "Paris", country: "France" },
    },
  });

  await expect(result).toSatisfyJudge(StructuredOutputJudge, {
    threshold: null,
  });
});

Setting threshold: null records the score without turning it into a failure.
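To make the threshold semantics concrete, here is a minimal sketch of how a score might be gated. The `applyThreshold` helper and the `GateResult` type are hypothetical names used only for illustration. They are not part of vitest-evals, and the library's internal logic may differ.

```typescript
// Hypothetical helper, not part of vitest-evals: illustrates threshold gating.
type Threshold = number | null;

interface GateResult {
  score: number;
  passed: boolean;
  recordedOnly: boolean;
}

function applyThreshold(score: number, threshold: Threshold): GateResult {
  if (threshold === null) {
    // A null threshold records the score but never produces a failure.
    return { score, passed: true, recordedOnly: true };
  }
  // Otherwise the check fails only when the score falls below the threshold.
  return { score, passed: score >= threshold, recordedOnly: false };
}

const recorded = applyThreshold(0.5, null);
const gated = applyThreshold(0.5, 0.8);
console.log(recorded.passed, gated.passed);
```

Under this assumption, the same score of 0.5 passes when it is only recorded and fails when a threshold of 0.8 is active.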
Expected Output
Put the expected value in run metadata.

const result = await run("What is the capital of France?", {
  metadata: {
    expected: {
      answer: "Paris",
      country: "France",
    },
  },
});

Structured output checks work best when the harness returns a domain object instead of raw provider text.
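One way a harness could produce such a domain object is to parse and validate the provider text before returning it. The sketch below assumes the provider replies with JSON. `CapitalAnswer` and `parseCapitalAnswer` are hypothetical names, and how the parsed value is wired into a harness depends on the harness API, which is not shown here.

```typescript
// Hypothetical domain type and parser; neither comes from vitest-evals.
interface CapitalAnswer {
  answer: string;
  country: string;
}

function parseCapitalAnswer(rawText: string): CapitalAnswer {
  // Assumes the provider was asked to reply with a JSON object.
  const parsed: unknown = JSON.parse(rawText);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Provider text is not a JSON object");
  }
  const fields = parsed as Record<string, unknown>;
  if (typeof fields.answer !== "string" || typeof fields.country !== "string") {
    throw new Error("Provider text is missing the expected string fields");
  }
  // Trim whitespace so formatting noise does not affect field comparisons.
  return { answer: fields.answer.trim(), country: fields.country.trim() };
}

const parsedAnswer = parseCapitalAnswer('{"answer": " Paris ", "country": "France"}');
console.log(parsedAnswer);
```

Validating at this boundary keeps malformed provider output from reaching the judge as a partially filled object.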
Failure Behavior
The judge fails when the configured expected fields do not match the harness output and the score falls below the active threshold. Keep harness output as a domain object so reports can show field-level differences instead of raw text comparisons.
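As an illustration of what a field-level difference can look like, the following sketch compares only the fields listed in the expected value. `diffFields` and `FieldDiff` are hypothetical names, and the `JSON.stringify` comparison is a simplification chosen for this example. It is not the judge's actual matching or reporting logic.

```typescript
// Hypothetical diff helper, not part of vitest-evals.
interface FieldDiff {
  field: string;
  expected: unknown;
  actual: unknown;
}

function diffFields(
  expected: Record<string, unknown>,
  actual: Record<string, unknown>,
): FieldDiff[] {
  const diffs: FieldDiff[] = [];
  // Only fields named in `expected` are checked; extra output fields are ignored.
  for (const [field, expectedValue] of Object.entries(expected)) {
    const actualValue = actual[field];
    // Simple structural comparison for plain data; sensitive to key order.
    if (JSON.stringify(actualValue) !== JSON.stringify(expectedValue)) {
      diffs.push({ field, expected: expectedValue, actual: actualValue });
    }
  }
  return diffs;
}

const diffs = diffFields(
  { answer: "Paris", country: "France" },
  { answer: "Lyon", country: "France" },
);
console.log(diffs);
```

With a domain object, a mismatch can be reported as a single differing field (here, `answer`), which is harder to recover from a raw text comparison.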