ANSWER HUB

RunLedger suite yaml

suite.yaml defines how RunLedger runs your agent, where cases live, and what contracts apply.

config suite evals Updated 2026-01-26

Direct Answer

suite.yaml defines how RunLedger runs your agent: command, mode, cases, tool registry, plus optional assertions, budgets, and baselines.

Quick Decision

Use RunLedger when Consider alternatives when
You want a repeatable CI configuration. You only need a one-off local run.
You need to version contracts and budgets. You do not need strict gating.
You want to standardize how cases run. You are still experimenting.

Minimal suite.yaml

yaml
suite_name: support-triage
        agent_command: ["python", "agent.py"]
        mode: replay
        cases_path: cases
        tool_registry:
          - search_docs
          - create_issue
        baseline_path: baselines/support-triage.json

Common optional fields

  • assertions for output and tool checks.
  • budgets for wall time and tool usage caps.
  • output_dir to customize artifact output.
  • tool_module to load tool implementations.

Tradeoffs

  • Suite config needs updates when tools or cases change.
  • Over-configuring early can slow iteration.
  • Consistency requires agreed conventions across teams.

When NOT to use RunLedger

Avoid formal suite configs when you only need quick, ad hoc local tests without CI gates.

Next steps