Phases & Runs

Agent Smith tracks its own development through a structured workflow of phases and runs. This is the meta-workflow — how the project itself is planned, executed, and documented.

The .agentsmith/ Directory

Every project that Agent Smith works on gets an .agentsmith/ directory:

.agentsmith/
├── context.yaml          # Project description + state tracking
├── coding-principles.md  # Detected coding conventions
├── decisions.md          # Why the agent made each decision
├── code-map.yaml         # LLM-generated code map
├── phases/
│   ├── done/             # Completed phase documents
│   ├── active/           # Currently executing (max 1)
│   └── planned/          # Upcoming phases
└── runs/
    ├── r01-fix-login-bug/
    │   ├── plan.md       # Execution plan
    │   └── result.md     # Outcome with cost data
    └── r02-add-search/
        ├── plan.md
        └── result.md
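
Agent Smith creates this directory itself. For test fixtures or experiments, the same layout can be reproduced with a few lines of Python. This is a minimal sketch: the `scaffold` helper is hypothetical and only creates empty files and folders with the names shown above.

```python
from pathlib import Path

# File and folder names are taken from the layout above.
TOP_LEVEL_FILES = ["context.yaml", "coding-principles.md", "decisions.md", "code-map.yaml"]
PHASE_DIRS = ["done", "active", "planned"]


def scaffold(project_root: Path) -> Path:
    """Create an empty .agentsmith/ skeleton under project_root (hypothetical helper)."""
    base = project_root / ".agentsmith"
    for name in PHASE_DIRS:
        (base / "phases" / name).mkdir(parents=True, exist_ok=True)
    (base / "runs").mkdir(parents=True, exist_ok=True)
    for name in TOP_LEVEL_FILES:
        # touch() leaves existing files untouched, so the call is safe to repeat.
        (base / name).touch(exist_ok=True)
    return base
```

Calling `scaffold(Path("."))` in an empty folder yields the tree above, minus the individual run directories.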

Phases

A phase is a unit of planned work — a feature, refactor, or capability addition. Each phase has its own Markdown document describing the goal, motivation, approach, files to create/modify, and definition of done.

Phase Lifecycle

planned/ → active/ → done/
  • planned/ — documented but not started. Includes requirements, approach, and acceptance criteria.
  • active/ — currently being worked on. Only one phase can be active at a time.
  • done/ — completed. The document stays as historical reference.
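
The lifecycle amounts to moving one Markdown file between three folders while enforcing the single-active rule. A sketch of that idea follows. It assumes phase documents are plain files inside phases/; the `advance_phase` helper is hypothetical and is not part of Agent Smith.

```python
from pathlib import Path

LIFECYCLE = ["planned", "active", "done"]


def advance_phase(phases_dir: Path, filename: str) -> Path:
    """Move a phase document one step along planned/ -> active/ -> done/."""
    for current, nxt in zip(LIFECYCLE, LIFECYCLE[1:]):
        source = phases_dir / current / filename
        if not source.exists():
            continue
        target_dir = phases_dir / nxt
        target_dir.mkdir(parents=True, exist_ok=True)
        # Only one phase may be active at a time.
        if nxt == "active" and any(target_dir.iterdir()):
            raise RuntimeError("another phase is already active")
        target = target_dir / filename
        source.rename(target)
        return target
    raise FileNotFoundError(f"{filename} is not in planned/ or active/")
```

Calling it once on a planned document activates it; calling it again on the same file moves it to done/.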

Phase Document Structure

# Phase 52: Single Executable Release

## Goal
What we're building and why.

## Motivation
The problem this solves.

## Approach
Technical details of the implementation.

## Files to Create
- list of new files

## Files to Modify
- list of existing files to change

## Definition of Done
- [ ] Checklist of acceptance criteria
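
Because the Definition of Done is a Markdown task list, it can be checked mechanically. The sketch below assumes completed items are marked `[x]`, as in GitHub-style task lists; the function names are illustrative, and whether Agent Smith itself uses the checklist this way is not stated here.

```python
import re

# Matches "- [ ] text" and "- [x] text".
CHECKBOX = re.compile(r"^- \[( |x|X)\] (.+)$")


def definition_of_done(markdown: str) -> list[tuple[bool, str]]:
    """Return (checked, text) pairs from the '## Definition of Done' section."""
    items = []
    in_section = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            # Any other level-2 heading ends the section.
            in_section = line.strip() == "## Definition of Done"
            continue
        match = CHECKBOX.match(line.strip())
        if in_section and match:
            items.append((match.group(1) != " ", match.group(2)))
    return items


def is_done(markdown: str) -> bool:
    """True only if the checklist exists and every item is ticked."""
    items = definition_of_done(markdown)
    return bool(items) and all(checked for checked, _ in items)
```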

Phase Tracking in context.yaml

The state section in context.yaml tracks all phases:

state:
  done:
    p01: "Initial pipeline: fetch ticket, checkout, plan, execute, commit"
    p02: "Retry and resilience: Polly policies, test retry loop"
    # ...
    p52: "Single executable release: binaries for 5 platforms, GitHub Releases"
  active: {}
  planned:
    p23: "Multi-repo support → .agentsmith/phases/planned/p23-multi-repo.md"
    p25: "PR review iteration → .agentsmith/phases/planned/p25-pr-review.md"

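Phases are numbered sequentially (p01, p02, ..., p52). The number reflects when a phase was added, not when it was completed, which is why p23 and p25 can still be planned after p52 is done. The text after each number is a one-line summary. Planned entries also carry the path to the full phase document.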
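
The state section implies two invariants: at most one phase is active, and a phase ID appears in only one of done, active, and planned. A minimal sketch of a consistency check is shown below. It assumes context.yaml has already been loaded into a plain dict with a YAML library of your choice; `check_state` is a hypothetical helper.

```python
def check_state(state: dict) -> list[str]:
    """Return a list of problems found in the `state` mapping (empty if none)."""
    problems = []
    buckets = {
        "done": state.get("done") or {},
        "active": state.get("active") or {},
        "planned": state.get("planned") or {},
    }

    # Only one phase can be active at a time.
    if len(buckets["active"]) > 1:
        problems.append(f"more than one active phase: {sorted(buckets['active'])}")

    # A phase ID must live in exactly one bucket.
    seen = {}
    for bucket_name, bucket in buckets.items():
        for phase_id in bucket:
            if phase_id in seen:
                problems.append(
                    f"{phase_id} is listed in both {seen[phase_id]} and {bucket_name}"
                )
            seen[phase_id] = bucket_name
    return problems
```

An empty list means the state is consistent with the rules described on this page.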

Runs

A run is a single execution of a pipeline against a ticket or task. Each run produces two artifacts in its own directory under runs/:

plan.md

The execution plan generated by the AI before writing code. Contains:

  • Analysis of the ticket and relevant code
  • Step-by-step implementation plan
  • Files to modify and why
  • Test strategy

result.md

The outcome of the run. Contains YAML frontmatter with machine-readable data, followed by Markdown sections for the human reader:

---
ticket: "#57  GET /todos returns 500 when database is empty"
project: todo-list
date: 2026-02-24
result: success
branch: fix/57
pr_url: https://github.com/org/repo/pull/42
duration_seconds: 50
cost:
  total_usd: 0.0682
  phases:
    scout:
      model: claude-haiku-4-5-20251001
      input_tokens: 12450
      output_tokens: 890
      turns: 3
      usd: 0.0062
    primary:
      model: claude-sonnet-4-20250514
      input_tokens: 45200
      output_tokens: 8900
      turns: 7
      usd: 0.0620
---

## Summary
What was done and why.

## Changes
Files modified with explanations.

## Decisions
Architectural choices made during execution.

## Test Results
Pass/fail status of the test suite.
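
The cost block is designed to add up: in the example above, 0.0062 (scout) + 0.0620 (primary) = 0.0682 (total_usd). Note that phases inside cost refers to the model passes of a single run, not to the project phases described earlier. The sketch below checks that arithmetic. It assumes the frontmatter has already been parsed into a dict, and the helper names are illustrative.

```python
import math


def check_cost(cost: dict, tolerance: float = 1e-6) -> bool:
    """True if total_usd equals the sum of the per-phase usd values."""
    phase_sum = sum(phase["usd"] for phase in cost["phases"].values())
    return math.isclose(cost["total_usd"], phase_sum, abs_tol=tolerance)


def total_tokens(cost: dict) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) summed over all phases."""
    phases = cost["phases"].values()
    return (
        sum(phase["input_tokens"] for phase in phases),
        sum(phase["output_tokens"] for phase in phases),
    )


# The cost block from the example above, as a parsed mapping.
cost = {
    "total_usd": 0.0682,
    "phases": {
        "scout": {"input_tokens": 12450, "output_tokens": 890, "usd": 0.0062},
        "primary": {"input_tokens": 45200, "output_tokens": 8900, "usd": 0.0620},
    },
}

print(check_cost(cost))    # True
print(total_tokens(cost))  # (57650, 9790)
```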

Run Numbering

Runs use a global counter (r01, r02, ...) tracked in context.yaml. Each run gets a directory named r{NN}-{slug}, where the slug is a short hyphenated description of the task:

runs/
├── r01-fix-login-bug/
├── r02-add-search-endpoint/
├── r03-security-scan-api/
└── r04-fix-null-reference/
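
A directory name in this scheme can be derived from the counter and a task title. The sketch below shows one way to do it. The slug rule (lowercase, non-alphanumerics collapsed to hyphens) is an assumption inferred from the examples above, and `run_dir_name` is a hypothetical helper.

```python
import re


def run_dir_name(counter: int, title: str) -> str:
    """Build a run directory name such as 'r01-fix-login-bug'."""
    # Assumed slug rule: lowercase, runs of other characters become one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Zero-pad to two digits; larger counters simply use more digits.
    return f"r{counter:02d}-{slug}"


print(run_dir_name(1, "Fix login bug"))        # r01-fix-login-bug
print(run_dir_name(4, "Fix null reference!"))  # r04-fix-null-reference
```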

The Complete Workflow

When Agent Smith processes a ticket:

  1. Plan — generates plan.md with the approach
  2. Execute — writes code, guided by the plan
  3. Test — runs the test suite
  4. Result — writes result.md with cost and decision data
  5. PR — commits everything, opens PR, includes result.md

The PR reviewer sees not just the code changes but also:

  • What the agent planned to do
  • What it actually did (and why it deviated, if applicable)
  • How much it cost
  • What decisions it made and why

This transparency is the point. When the agent's code breaks six months later, you'll know what it was thinking.