JudgeContext

Full normalized context passed to every judge.

Example

type RefundOutput = { status: "approved" | "denied" };

const RefundStatusJudge = createJudge<string, RefundOutput>(
  "RefundStatusJudge",
  ({ output }) => ({
    score: output.status === "approved" ? 1 : 0,
  }),
);

Type Parameters

TInput

TInput = any

TOutput

TOutput extends JsonValue | undefined = JsonValue | undefined

THarness

THarness extends Harness<TInput, TOutput> | undefined = Harness<TInput, TOutput> | undefined
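As a rough illustration of how these defaults behave, the sketch below declares a simplified stand-in for the context type. `SketchContext` and the local `JsonValue` alias are hypothetical names introduced here, not the library's own definitions, and `THarness` is omitted for brevity.

```typescript
// Local stand-in for a JSON value type (assumption: the library's JsonValue
// is similar, but its exact definition is not shown in this reference).
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// Hypothetical, simplified context with the same TInput/TOutput defaults.
type SketchContext<
  TInput = any,
  TOutput extends JsonValue | undefined = JsonValue | undefined,
> = {
  input: TInput;
  output: TOutput;
};

type RefundOutput = { status: "approved" | "denied" };

// Explicit type arguments narrow both input and output.
const typedCtx: SketchContext<string, RefundOutput> = {
  input: "order-123",
  output: { status: "approved" },
};

// With no type arguments, the defaults apply: any input, any JSON output
// (or undefined).
const looseCtx: SketchContext = { input: 42, output: undefined };
```

Supplying `TOutput` explicitly is what lets a judge read `output.status` without a cast; with the defaults, the output would have to be narrowed first.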

Properties

harness

harness: THarness

Harness associated with this judge context.

input

input: TInput

Original eval input passed to the harness.

output

output: TOutput

App-facing output returned by the harness.

run

run: HarnessRun<TOutput>

Complete normalized harness run being judged.

runJudge?

optional runJudge?: RunJudge

Runs the configured matcher, judge, or suite judge harness with run-scoped context.
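Because `runJudge` is optional, a judge that relies on it presumably needs to handle the case where it is absent. The sketch below shows one way to guard the call. The `RunJudge` signature is not given in this reference, so the zero-argument stand-in here is purely an assumption, and `scoreWithFallback` is a hypothetical helper.

```typescript
// Stand-in for illustration only: the real RunJudge signature is not shown
// in this reference, so a zero-argument function returning a score is assumed.
type RunJudge = () => { score: number };

// Minimal slice of the context containing only the optional property.
type RunJudgeContext = { runJudge?: RunJudge };

// Hypothetical helper: delegates to runJudge when it is present,
// otherwise returns the supplied fallback score.
function scoreWithFallback(ctx: RunJudgeContext, fallback: number): number {
  return ctx.runJudge ? ctx.runJudge().score : fallback;
}

const withRunner = scoreWithFallback({ runJudge: () => ({ score: 1 }) }, 0);
const withoutRunner = scoreWithFallback({}, 0);
```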

session

session: HarnessRun<TOutput>["session"]

Normalized transcript associated with the harness run.

toolCalls

toolCalls: ToolCall[]

Flattened tool calls observed in the normalized session.
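A judge can score a run by inspecting `toolCalls` instead of the output. The following is a minimal sketch under stated assumptions: the `ToolCall` shape is not documented in this section, so a `name` field is assumed, and `scoreToolUsage` is a hypothetical function, not part of the library.

```typescript
// Stand-in types for illustration only; the real ToolCall shape comes from
// the library and may differ (assumption: each call exposes a `name`).
type ToolCall = { name: string };

// Minimal slice of the context containing only the flattened tool calls.
type ToolCallContext = { toolCalls: ToolCall[] };

// Hypothetical judge body: scores 1 if the named tool was called at least
// once during the run, otherwise 0.
function scoreToolUsage(
  ctx: ToolCallContext,
  toolName: string,
): { score: number } {
  const called = ctx.toolCalls.some((call) => call.name === toolName);
  return { score: called ? 1 : 0 };
}

const toolResult = scoreToolUsage(
  { toolCalls: [{ name: "lookupOrder" }, { name: "issueRefund" }] },
  "issueRefund",
);
```

Since the list is already flattened, a single pass with `some` is enough here; there is no need to walk the session transcript.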