COMPARISON
RunLedger vs VCR.py-style recording
VCR.py is ideal for HTTP clients. RunLedger records calls over any tool protocol and adds contract and budget checks.
Direct Answer
Recommendation
Use RunLedger when your agent calls tools beyond plain HTTP and you need CI gates.
VCR.py captures HTTP traffic. RunLedger captures arbitrary tool calls and applies schema + budget gates on replay.
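To make "capturing arbitrary tool calls" concrete, here is a minimal sketch of the record-then-replay idea for a non-HTTP tool. It is not RunLedger's API: `ToolCassette`, `record`, `replay`, and `lookup_user` are hypothetical names invented for illustration, and the sketch assumes calls are matched strictly in recorded order.

```python
from __future__ import annotations

from typing import Any, Callable


class ToolCassette:
    """Hypothetical recorder: stores tool calls in order and replays them."""

    def __init__(self) -> None:
        self.entries: list[dict[str, Any]] = []
        self._cursor = 0

    def record(self, tool: str, args: dict[str, Any], fn: Callable[..., Any]) -> Any:
        # Call the real tool once and remember its name, arguments, and result.
        result = fn(**args)
        self.entries.append({"tool": tool, "args": args, "result": result})
        return result

    def replay(self, tool: str, args: dict[str, Any]) -> Any:
        # Serve the next recorded call; fail if the agent deviates from the recording.
        if self._cursor >= len(self.entries):
            raise AssertionError(f"unexpected extra call to {tool!r}")
        entry = self.entries[self._cursor]
        if (entry["tool"], entry["args"]) != (tool, args):
            raise AssertionError(f"expected {entry['tool']!r}, got {tool!r} with {args}")
        self._cursor += 1
        return entry["result"]


def lookup_user(user_id: int) -> dict[str, Any]:
    # Stand-in for a non-HTTP tool such as a database query.
    return {"id": user_id, "name": "Ada"}


cassette = ToolCassette()
recorded = cassette.record("lookup_user", {"user_id": 7}, lookup_user)
replayed = cassette.replay("lookup_user", {"user_id": 7})  # no real call is made
```

Because the cassette keys on tool name and arguments rather than on HTTP requests, the same mechanism covers database queries, file operations, or any other tool the agent invokes.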
Quick Decision
| Use RunLedger when | Use VCR.py when |
|---|---|
| Your agent calls multiple tools | You only need HTTP recording |
| You need contract + budget gates | You only need request/response playback |
When VCR.py is better
- You have a pure HTTP client in Python.
- You want the simplest recording setup for requests.
- You do not need tool ordering or budget checks.
When RunLedger wins
- You need to replay non-HTTP tools or DB calls.
- You want schema validation and hard budgets.
- You need baseline regression checks in CI.
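The three checks in this list can be sketched as gates applied to a replayed run. This is an illustration under stated assumptions, not RunLedger's implementation: `SCHEMAS`, `check_run`, and the call-record shape are hypothetical, the budget is assumed to be a maximum call count, and the baseline is assumed to be the expected tool sequence.

```python
from __future__ import annotations

from typing import Any

# Hypothetical contract: required argument names and their types, per tool.
SCHEMAS: dict[str, dict[str, type]] = {
    "search_docs": {"query": str},
    "lookup_user": {"user_id": int},
}


def check_run(
    calls: list[dict[str, Any]],
    schemas: dict[str, dict[str, type]],
    max_calls: int,
    baseline_tools: list[str],
) -> list[str]:
    """Return a list of gate failures; an empty list means the run passes."""
    failures: list[str] = []

    # Schema gate: every call must target a known tool with well-typed arguments.
    for call in calls:
        expected = schemas.get(call["tool"])
        if expected is None:
            failures.append(f"unknown tool: {call['tool']}")
            continue
        for name, typ in expected.items():
            if not isinstance(call["args"].get(name), typ):
                failures.append(f"{call['tool']}: bad or missing argument {name!r}")

    # Budget gate: a hard cap on the number of tool calls (assumed budget unit).
    if len(calls) > max_calls:
        failures.append(f"budget exceeded: {len(calls)} calls > {max_calls}")

    # Baseline gate: the tool sequence must match the recorded baseline.
    tools = [call["tool"] for call in calls]
    if tools != baseline_tools:
        failures.append(f"tool order changed: {tools} != {baseline_tools}")

    return failures


run = [
    {"tool": "search_docs", "args": {"query": "refund policy"}},
    {"tool": "lookup_user", "args": {"user_id": "7"}},  # wrong type: str, not int
]
failures = check_run(run, SCHEMAS, max_calls=1, baseline_tools=["search_docs"])
```

In CI, a non-empty `failures` list would translate into a non-zero exit code so the pipeline blocks the change.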
Tradeoffs
- RunLedger requires suite definitions, which is more setup than a single decorator.
- You must integrate your agent's protocol with RunLedger.
```bash
runledger run ./evals/demo --mode record
runledger run ./evals/demo --mode replay --baseline baselines/demo.json
```
When NOT to use RunLedger
If your workflow is a standard HTTP client with no tool protocol, VCR.py is sufficient.