COMPARISON

RunLedger vs Mocks

Mocks are precise for unit tests. RunLedger captures real tool behavior once and replays it in CI.

Tags: comparison, mocks, testing. Updated 2026-01-23

Direct Answer

Recommendation: Use RunLedger when maintaining mocks becomes a bottleneck or when the mocks miss real tool behavior.

Mocks are excellent for unit-level isolation. RunLedger is better for end-to-end agent flows where tool behavior and ordering matter.
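To make the unit-level case concrete, here is a minimal sketch of a mocked tool using Python's standard `unittest.mock`. The `describe_weather` function, the weather tool, and its response shape are hypothetical and chosen only for illustration. Note that the fixture is written by hand, so it reflects what the author believes the tool returns, not what the tool was observed to return.

```python
from unittest.mock import Mock


def describe_weather(city, weather_tool):
    """Format a one-line report from a (hypothetical) weather tool's response."""
    data = weather_tool(city)
    return f"{city}: {data['temp_c']} C, {data['conditions']}"


# Hand-written fixture: this response shape is an assumption made by the test
# author. If the real tool changes its fields, this test keeps passing.
weather_tool = Mock(return_value={"temp_c": 21, "conditions": "clear"})

report = describe_weather("Lisbon", weather_tool)

# The mock verifies this one call in isolation; it says nothing about how
# the call fits into a longer multi-tool flow.
weather_tool.assert_called_once_with("Lisbon")
print(report)
```

This is small and fast, which is why mocks suit isolated unit tests. The cost appears when many such fixtures have to be kept in sync with real tools by hand.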

Use RunLedger when	Use mocks when
You want full flow determinism	You want isolated unit behavior
You need recorded real tool shapes	You can maintain small mock fixtures
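The record-and-replay idea in the left column can be sketched generically as follows. This is an illustration of the technique only, not RunLedger's implementation or file format: the `ToolTape` class, the `search` tool, and the entry layout are all hypothetical names introduced for this example. In record mode the real tool runs once and its result is stored in call order; in replay mode the stored result is returned and any divergence in tool name, arguments, or ordering fails the run.

```python
import json


class ToolTape:
    """Hypothetical recorder: stores tool calls in order, or replays them."""

    def __init__(self, mode, entries=None):
        self.mode = mode              # "record" or "replay"
        self.entries = entries or []  # list of {"tool", "args", "result"}
        self.cursor = 0               # position of the next call to replay

    def call(self, name, fn, **args):
        if self.mode == "record":
            result = fn(**args)       # invoke the real tool once
            self.entries.append({"tool": name, "args": args, "result": result})
            return result
        # Replay: the next recorded call must match by tool name and arguments.
        if self.cursor >= len(self.entries):
            raise AssertionError(f"unexpected extra call to {name}")
        expected = self.entries[self.cursor]
        if (expected["tool"], expected["args"]) != (name, args):
            raise AssertionError(
                f"call {self.cursor}: expected {expected['tool']}, got {name}"
            )
        self.cursor += 1
        return expected["result"]


def search(query):
    """Stand-in for a real tool; a real one would hit a network service."""
    return {"hits": [f"doc about {query}"]}


def agent(tape):
    """Tiny agent flow that makes one tool call through the tape."""
    found = tape.call("search", search, query="refunds")
    return found["hits"][0]


recorder = ToolTape("record")
recorded_answer = agent(recorder)

# Round-trip through JSON to mimic saving the recording to disk.
saved = json.dumps(recorder.entries)

replayer = ToolTape("replay", json.loads(saved))
replayed_answer = agent(replayer)   # `search` is never invoked in this run
```

Because the stored results come from a real invocation, the response shapes are observed, not guessed, and the ordered cursor is what makes the whole flow deterministic in CI.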

bash

# Record real tool behavior once.
runledger run ./evals/demo --mode record
# Replay the recording in CI, pointing at the baseline file.
runledger run ./evals/demo --mode replay --baseline baselines/demo.json

If you only need isolated unit tests with tiny fixtures, mocks are simpler.

