FactualityJudge

FactualityJudge(config?): Judge<FactualityJudgeOptions<any, JsonValue | undefined, HarnessMetadata, Harness<any, JsonValue | undefined, HarnessMetadata> | undefined>>

Creates a factuality judge over normalized harness output.

FactualityJudge() compares input, output, and expected from the current JudgeContext, so the same judge can run against any application harness. Configure the LLM used for grading with judgeHarness on the judge, suite, or matcher options.
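To make the comparison concrete, here is a minimal sketch of how the three values might be combined into a grading prompt for the judge LLM. The `FactualityInputs` type, the `buildFactualityPrompt` helper, and the prompt wording are all hypothetical and exist only for illustration; they are not part of vitest-evals, and the library's actual prompt and `JudgeContext` shape may differ.

```typescript
// Hypothetical shape: only the three fields named above,
// not the real JudgeContext type from vitest-evals.
interface FactualityInputs {
  input: string;
  output: string;
  expected: string;
}

// Illustrative only: assembles a grading prompt from the three values.
// The wording is an assumption, not the library's prompt.
function buildFactualityPrompt({ input, output, expected }: FactualityInputs): string {
  return [
    "Compare the submitted answer with the expert answer for factual consistency.",
    `Question: ${input}`,
    `Expert answer: ${expected}`,
    `Submitted answer: ${output}`,
  ].join("\n");
}

const prompt = buildFactualityPrompt({
  input: "What is the capital of France?",
  output: "The capital is Paris.",
  expected: "Paris is the capital of France.",
});
```

Because the sketch reads only `input`, `output`, and `expected`, nothing in it depends on which application harness produced the output, which is the property that lets one judge be reused across harnesses.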

config: FactualityJudgeConfig = {}

Optional judge name and a reusable default judge harness.

Returns: Judge<FactualityJudgeOptions<any, JsonValue | undefined, HarnessMetadata, Harness<any, JsonValue | undefined, HarnessMetadata> | undefined>>

import { anthropic } from "@ai-sdk/anthropic";
import { aiSdkJudgeHarness } from "@vitest-evals/harness-ai-sdk";
import { describeEval, FactualityJudge } from "vitest-evals";
import { qaHarness } from "./qaHarness";

const judgeHarness = aiSdkJudgeHarness({
  model: anthropic("claude-sonnet-4-5"),
  temperature: 0,
});

const factualityJudge = FactualityJudge({ judgeHarness });

describeEval("qa agent", {
  harness: qaHarness,
  judges: [factualityJudge],
}, (it) => {
  it("answers a geography question", async ({ run }) => {
    await run("What is the capital of France?", {
      metadata: {
        expected: "Paris is the capital of France.",
      },
    });
  });
});