Agent Skill

Install the vitest-evals agent skill when you want Claude Code, Cursor, Codex, or another coding agent to author, review, or debug harness-backed eval suites in your project.

The skill gives the agent focused context about suite authoring, harness selection, judges, tool replay, and verification — so it writes idiomatic vitest-evals code without guessing at the API.

Install

Install the skill with dotagents:

npx @sentry/dotagents add getsentry/vitest-evals

Or with the skills CLI:

npx skills add getsentry/vitest-evals

What the Skill Teaches Agents

Choose the right harness for the app runtime.
Use describeEval(...) with one harness per suite.
Call run(...) explicitly inside each eval.
Assert on result.output and use judges for scoring.
Configure tool replay without hiding live model behavior.
Run focused verification commands after edits.
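
The workflow in the list above can be pictured with a small, self-contained sketch. Everything in it is a hypothetical stand-in written for illustration: the Harness, RunResult, and Judge types, echoHarness, containsJudge, and capitalEval are not the vitest-evals API, and the sketch runs synchronously for brevity. It only shows the shape of the pattern: one harness, an explicit run call inside the eval, a hard assertion on the output, and a judge that produces a score.

```typescript
// Hypothetical stand-ins: none of these names come from vitest-evals.
// They only model the workflow described in the list above.

/** Result of one harness run; `output` is what the eval asserts on. */
interface RunResult {
  output: string;
}

/** A harness adapts one app runtime; a suite would use exactly one. */
interface Harness {
  name: string;
  run(input: string): RunResult; // synchronous here only to keep the sketch short
}

/** A judge returns a score between 0 and 1 rather than a pass/fail verdict. */
type Judge = (result: RunResult, expected: string) => number;

// Toy harness standing in for a real runtime adapter.
const echoHarness: Harness = {
  name: "echo",
  run: (input) => ({ output: `The capital of ${input} is Paris.` }),
};

// Toy judge: 1 when the expected substring appears in the output, else 0.
const containsJudge: Judge = (result, expected) =>
  result.output.includes(expected) ? 1 : 0;

/** One eval: call run explicitly, assert on the output, then score it. */
function capitalEval(harness: Harness): number {
  const result = harness.run("France"); // explicit run inside the eval
  if (result.output.length === 0) {
    throw new Error("expected non-empty output"); // hard assertion on result.output
  }
  return containsJudge(result, "Paris"); // the judge produces the score
}
```

Keeping the hard assertion separate from the judge score reflects the split the list describes: assertions catch outright failures, while judges grade quality. For the actual describeEval(...) and run(...) signatures, rely on the references that ship with the skill rather than on this sketch.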

What Is Included

The skill installs as a single directory with:

SKILL.md — activation rules, runtime defaults, and a reference router that points the agent at the right guide for the current task.
Focused references — suite authoring, AI SDK harness, Pi harness, custom harnesses, judges and assertions, tool replay, and troubleshooting.
SPEC.md and SOURCES.md — maintenance notes and source provenance for future skill updates.

Next

Harnesses: Compare the supported runtime adapters before choosing one.
Judges: Use built-in judges, write custom judges, and set thresholds.
Tool Replay: Record deterministic tool calls without hiding model behavior.
GitHub Reporting: Publish eval summaries and checks from workflow JSON output.
