ANSWER HUB

RunLedger replay mode

Replay mode reuses cassette entries so CI never calls live tools.

Tags: replay, ci, cassettes. Updated 2026-01-26

Direct Answer

Replay mode reuses recorded cassettes so CI never calls live tools. It then applies assertions, budgets, and optional baseline gates, and fails the run when any of them detects a regression.
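
The replay idea can be sketched in a few lines of Python. This is an illustration only: the cassette layout (entries with `tool`, `args`, and `output` keys) and the names `ReplayCassette` and `CassetteMismatch` are assumptions made for this example, not RunLedger's actual file format or API.

```python
import json


class CassetteMismatch(Exception):
    """Raised when a tool call has no recorded entry (hypothetical name)."""


class ReplayCassette:
    """Serves recorded tool outputs instead of calling live tools."""

    def __init__(self, entries):
        # Index recorded outputs by tool name plus canonicalized arguments.
        self._recorded = {
            self._key(e["tool"], e["args"]): e["output"] for e in entries
        }

    @staticmethod
    def _key(tool, args):
        # sort_keys makes the key independent of argument ordering.
        return tool, json.dumps(args, sort_keys=True)

    def call(self, tool, args):
        key = self._key(tool, args)
        if key not in self._recorded:
            # The agent made a call that was never recorded: fail loudly.
            raise CassetteMismatch(f"no recorded entry for {tool} with {args}")
        return self._recorded[key]


# Assumed cassette content, for illustration only.
cassette = ReplayCassette([
    {"tool": "search", "args": {"q": "refund policy"}, "output": {"hits": 3}},
])
print(cassette.call("search", {"q": "refund policy"}))
```

Because every result comes from the recorded entries, the same inputs always produce the same tool outputs, which is what makes the CI run deterministic.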

Quick Decision

Use RunLedger when	Consider alternatives when
You need deterministic CI and fast runs.	You require live data every run.
You want replay to enforce contracts and budgets.	You only want soft monitoring.
You can maintain cassettes over time.	You cannot update fixtures reliably.

Replay command

```bash
runledger run ./evals/<suite> --mode replay --baseline baselines/<suite>.json
```
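
If CI steps are driven from Python, the command above could be wrapped as in the sketch below. It uses only the arguments shown in the command. The helper names `replay_command` and `run_replay` are hypothetical, and the sketch assumes the CLI reports failure through a non-zero exit code, which this page does not state.

```python
import subprocess


def replay_command(suite):
    """Build the replay invocation for one suite name."""
    return [
        "runledger", "run", f"./evals/{suite}",
        "--mode", "replay",
        "--baseline", f"baselines/{suite}.json",
    ]


def run_replay(suite):
    """Run the replay and return the process exit code."""
    # Assumption: a non-zero exit code means the replay run failed.
    return subprocess.run(replay_command(suite)).returncode
```

A CI script could then call `sys.exit(run_replay("<suite>"))` with the real suite name, so the job fails whenever the replay does.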

What can fail

Cassette mismatch when tool calls change.
Assertion failures on output schema or tool usage.
Budget failures on wall time or tool limits.
Baseline regressions on success rate or latency.
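
The budget and baseline failures listed above amount to simple comparisons. The sketch below shows one way such a gate might look. The `RunStats` fields, the `check_run` function, and the 10% latency tolerance are all assumptions chosen for illustration, not RunLedger's real configuration or thresholds.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    success_rate: float  # fraction of cases that passed, 0.0 to 1.0
    wall_time_s: float   # total wall-clock time in seconds
    tool_calls: int      # number of tool calls made


def check_run(stats, baseline, max_wall_time_s, max_tool_calls,
              latency_tolerance=1.10):
    """Return failure messages; an empty list means the gate passes."""
    failures = []
    # Budget checks: absolute limits on the current run.
    if stats.wall_time_s > max_wall_time_s:
        failures.append("budget: wall time exceeded")
    if stats.tool_calls > max_tool_calls:
        failures.append("budget: tool call limit exceeded")
    # Baseline checks: comparisons against a previously accepted run.
    if stats.success_rate < baseline.success_rate:
        failures.append("baseline: success rate regressed")
    if stats.wall_time_s > baseline.wall_time_s * latency_tolerance:
        failures.append("baseline: latency regressed")
    return failures


baseline = RunStats(success_rate=0.95, wall_time_s=10.0, tool_calls=12)
current = RunStats(success_rate=0.90, wall_time_s=10.5, tool_calls=12)
print(check_run(current, baseline, max_wall_time_s=30.0, max_tool_calls=20))
```

Collecting all failures before reporting, rather than stopping at the first one, lets a single CI run show every gate that needs attention.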

Tradeoffs

Replay runs depend on cassette freshness.
Live behavior may drift from recorded outputs.
Changes require re-recording and review.

When NOT to use RunLedger

Skip replay when every run must use live data, or when you cannot store tool outputs.

Next steps

Read the Golden Path
