Start here if you're doing agent evals in CI

Follow the Golden Path first, then jump to the answer track that matches your use case.

Tags: start-here, overview, ci. Updated 2026-01-23.

Direct Answer

If you're running agent evals in CI, start with the Golden Path, then choose the relevant Answer Hub pages based on your tooling and constraints.

Quick Decision

If you need                                Go to
Record -> Replay -> CI gating              Golden Path
Comparison against testing alternatives    Comparisons
Setup details and configs                  Docs

Tradeoffs

  • You will maintain cassettes and baselines alongside code.
  • RunLedger does not replace quality scoring tools.
  • Periodic live checks can complement replayed CI runs.
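
To make the first tradeoff concrete, here is a minimal sketch of what a cassette and a baseline look like when they sit next to the code. It is not RunLedger's API: the names call_tool and gate, the JSON cassette layout, and the 10% tolerance are all assumptions made for illustration. The idea is that tool results are recorded once, replayed deterministically in CI, and a run fails when a metric regresses past the stored baseline.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional


def call_tool(name: str, args: Dict[str, Any], cassette: Path,
              live: Optional[Callable[[str, Dict[str, Any]], Any]] = None) -> Any:
    """Replay a recorded tool result; record it first if `live` is given.

    Hypothetical helper: the cassette is a JSON file keyed by (name, args).
    """
    key = json.dumps([name, args], sort_keys=True)
    entries = json.loads(cassette.read_text()) if cassette.exists() else {}
    if key in entries:
        return entries[key]  # replay: no live call, same result every run
    if live is None:
        # Replay-only mode (as in CI): an unrecorded call is an error.
        raise KeyError(f"no recorded result for {key}")
    entries[key] = live(name, args)  # record mode: call the real tool once
    cassette.write_text(json.dumps(entries, indent=2, sort_keys=True))
    return entries[key]


def gate(metrics: Dict[str, float], baseline: Dict[str, float],
         tolerance: float = 0.10) -> List[str]:
    """Return baseline metrics that are missing or exceed baseline by more than `tolerance`."""
    return [
        name for name, base in baseline.items()
        if metrics.get(name, float("inf")) > base * (1 + tolerance)
    ]
```

In this sketch a CI job would call call_tool without a live function, so any tool call missing from the cassette fails loudly, and would exit non-zero when gate returns a non-empty list. Re-recording the cassette or updating the baseline is then an ordinary, reviewable change in the repository.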

When NOT to use RunLedger

If your agent workflows make no tool calls, or you only need quality scoring, simpler tests may be enough.