Branch Persistence
Pipeline runs do real work — generated plans, agentic edits, test runs. When a pod restarts mid-pipeline, that work is on a /tmp working tree that the next pod cannot see. Without intervention, the next attempt would re-run every analyzer call from scratch, which is both expensive (tokens) and non-idempotent for any LLM round.
Branch persistence is the framework's mitigation: every ticket gets a deterministic work-branch on the source remote, and the failure path of every pipeline pushes the working tree to that branch as a [wip] commit before the lifecycle is marked Failed.
Branch naming
The work-branch name is derived from the ticket id. It is stable — the same ticket on the same source produces the same branch every time, so a re-run can find the previous attempt's state.
| Form | When it applies | Example |
|---|---|---|
| `agent-smith/{ticketId}` | One-repo-per-ticket-system deployments (the AAD-DEV pattern, no project disambiguation needed) | `agent-smith/18693` |
| `agent-smith/{platform}/{projectSlug}/{ticketId}` | Multi-platform / multi-project deployments where ticket ids may collide across systems | `agent-smith/azurerepos/cloud-development/18693` |
Slug rules for the hierarchical form:
- Lower-cased, non-alphanumeric runs collapsed to `-`, leading/trailing-trimmed
- Slugs longer than 64 characters are truncated and suffixed with a 7-char SHA-1 hash of the original slug to keep the result deterministic
- A `projectName` that slugifies to empty (e.g. `"!!!---???"`) is rejected at compose time as a configuration error
Composition lives in TicketBranchNamer (AgentSmith.Application.Services) — a static helper. There is no DI registration; builders call it directly.
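The naming and slug rules above can be sketched roughly as follows. This is an illustrative Python model, not the actual `TicketBranchNamer` code: the function names `slugify` and `compose_branch` are hypothetical, the platform segment is assumed to arrive as an already-normalized token, and the choice to keep the truncated slug plus hash suffix within 64 characters is an assumption (the rules only say that long slugs are truncated and suffixed).

```python
import hashlib
import re
from typing import Optional

MAX_SLUG_LEN = 64  # limit stated in the slug rules
HASH_LEN = 7       # length of the SHA-1 suffix


def slugify(value: str) -> str:
    """Lower-case, collapse non-alphanumeric runs to '-', trim, bound the length."""
    slug = re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")
    if len(slug) > MAX_SLUG_LEN:
        # Hash the full slug so two long names with the same prefix stay distinct.
        digest = hashlib.sha1(slug.encode("utf-8")).hexdigest()[:HASH_LEN]
        # Assumption: prefix + "-" + hash must fit back into MAX_SLUG_LEN.
        prefix = slug[: MAX_SLUG_LEN - HASH_LEN - 1].rstrip("-")
        slug = f"{prefix}-{digest}"
    return slug


def compose_branch(
    ticket_id: str,
    platform: Optional[str] = None,
    project_name: Optional[str] = None,
) -> str:
    """Return the flat form, or the hierarchical form when a project is given."""
    if platform is None and project_name is None:
        return f"agent-smith/{ticket_id}"
    project_slug = slugify(project_name or "")
    if not platform or not project_slug:
        # Mirrors the "rejected at compose time" rule for empty slugs.
        raise ValueError("platform and a non-empty project slug are required")
    return f"agent-smith/{platform}/{project_slug}/{ticket_id}"
```

Because the result depends only on the inputs, calling `compose_branch("18693", "azurerepos", "Cloud Development")` on any pod yields the same name, which is what lets a later run locate the earlier attempt.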
The resume path
When CheckoutSourceCommand runs, it composes the work-branch name from the ticket and asks the source provider to check out that branch. The provider's CheckoutBranch does:
- Look for a local branch with that name. If found, check it out and stop.
- Fetch from `origin`.
- Look for `refs/remotes/origin/{branch}`. If found, create a local tracking branch from it and check it out (resume).
- Otherwise, create a fresh branch from the current `HEAD` (legacy behavior: first attempt for this ticket).
This means: if a previous pipeline run pushed a [wip] commit, the next run picks up exactly where the prior run stopped. No replays of expensive analyzer calls; the agentic loop sees the prior tool output as committed file state.
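The decision order in the checkout steps can be modeled as a small pure function. This is only a sketch of the control flow: `resolve_checkout` and its parameters are hypothetical names, and the `fetch_remote_refs` callback stands in for whatever the provider does to fetch from `origin` and list remote-tracking refs.

```python
from typing import Callable, Set


def resolve_checkout(
    branch: str,
    local_branches: Set[str],
    fetch_remote_refs: Callable[[], Set[str]],
) -> str:
    """Return the action taken: 'local', 'resume', or 'fresh'."""
    # Step 1: an existing local branch wins and no fetch is needed.
    if branch in local_branches:
        return "local"
    # Step 2: fetch, then look for the remote-tracking ref.
    remote_refs = fetch_remote_refs()
    if f"refs/remotes/origin/{branch}" in remote_refs:
        # Step 3: a previous run pushed work, so track it and resume.
        return "resume"
    # Step 4: nothing to resume, so branch from the current HEAD.
    return "fresh"
```

Keeping the fetch behind the local check means a retry on the same pod does not need network access to continue.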
The persist path
PipelineExecutor wraps every pipeline run. When any step returns a failed CommandResult, the executor invokes PersistWorkBranchHandler before calling lifecycle.MarkFailed(). The handler runs in its own try/catch — a persist failure must never mask the original pipeline failure.
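The ordering guarantee described here (persist first, mark Failed afterwards, and never let a persist error escape) might look like the following in outline. The function `run_failure_path` and its callback parameters are hypothetical; the real executor and handler signatures are not shown in this page.

```python
from typing import Callable, List


def run_failure_path(
    persist: Callable[[], None],
    mark_failed: Callable[[], None],
    log: List[str],
) -> None:
    """Attempt the WIP push, then always mark the lifecycle as failed."""
    try:
        persist()
        log.append("persisted")
    except Exception as exc:
        # A persist failure is recorded but must not mask the
        # original pipeline failure, so it is not re-raised.
        log.append(f"persist failed: {exc}")
    mark_failed()
```

The key property is that `mark_failed` runs on both branches, so the ticket lifecycle reflects the original failure whether or not the push succeeded.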
The handler:
- Reads the `Repository` from the pipeline context. If absent (the pipeline failed before checkout), records `Unknown` and returns Fail.
- Builds a `[wip] agent-smith run {runId}` commit message with three trailers (`Run-Id`, `Pipeline`, `Failed-Step`) so the commit is searchable from a log line.
- Calls `ISourceProvider.CommitAndPushAsync`.
- Classifies any thrown exception into a `PersistFailureKind` and stamps it onto `ContextKeys.PersistFailureKind` for the executor's logging wrapper to route on.
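A commit message of the shape described in the handler steps could be assembled as below. The helper name `build_wip_message` is hypothetical, and the exact layout is an assumption based on Git's usual trailer convention (`Key: value` lines in the final paragraph, separated from the subject by a blank line); only the subject format and the three trailer keys come from this page.

```python
def build_wip_message(run_id: str, pipeline: str, failed_step: str) -> str:
    """Build a '[wip]' subject line followed by three Git-style trailers."""
    subject = f"[wip] agent-smith run {run_id}"
    trailers = [
        f"Run-Id: {run_id}",
        f"Pipeline: {pipeline}",
        f"Failed-Step: {failed_step}",
    ]
    # A blank line separates the subject from the trailer paragraph.
    return subject + "\n\n" + "\n".join(trailers)
```

With this layout, an operator who has a run id from a log line can search commit messages for it, for example with `git log --grep`.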
Failure kinds
| Kind | Trigger | Operator action |
|---|---|---|
| `NoChanges` | Working tree was clean (provider returned an empty-commit signal) | Informational: the pipeline failed before producing any file changes; nothing to persist |
| `AuthDenied` | Push rejected with HTTP 401/403 or "unauthorized" | Check the source-provider PAT/credentials; the pipeline run still has output in logs |
| `RemoteDivergent` | Push rejected as non-fast-forward | Two pipeline runs raced on the same ticket; investigate the older branch on the remote and decide which to keep. The framework refuses to force-push |
| `NetworkBlip` | `HttpRequestException` during push | Transient: operator can re-trigger the ticket and the resume path will pick up the local state if the next run lands on the same pod; otherwise the work is lost |
| `Unknown` | Anything else | Inspect the log for the underlying exception |
Persist failures are logged at Error (or Warning for NetworkBlip) — they show up in the run telemetry next to the original pipeline failure, not in place of it.
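The table and the log-level rule can be read as a classification function plus a level lookup. The sketch below is a loose Python analogue, not the framework's code: `NothingToCommit` and `PushRejected` are hypothetical exception types, Python's built-in `ConnectionError` stands in for `HttpRequestException`, and matching on the substring "non-fast-forward" is an assumption about how a divergent push is detected.

```python
import enum
import logging
from typing import Optional


class PersistFailureKind(enum.Enum):
    NO_CHANGES = "NoChanges"
    AUTH_DENIED = "AuthDenied"
    REMOTE_DIVERGENT = "RemoteDivergent"
    NETWORK_BLIP = "NetworkBlip"
    UNKNOWN = "Unknown"


class NothingToCommit(Exception):
    """Hypothetical stand-in for the provider's empty-commit signal."""


class PushRejected(Exception):
    """Hypothetical push failure carrying an optional HTTP status."""

    def __init__(self, message: str, status: Optional[int] = None):
        super().__init__(message)
        self.status = status


def classify(exc: BaseException) -> PersistFailureKind:
    """Map an exception from the push attempt onto a failure kind."""
    if isinstance(exc, NothingToCommit):
        return PersistFailureKind.NO_CHANGES
    if isinstance(exc, ConnectionError):
        return PersistFailureKind.NETWORK_BLIP
    if isinstance(exc, PushRejected):
        text = str(exc).lower()
        if exc.status in (401, 403) or "unauthorized" in text:
            return PersistFailureKind.AUTH_DENIED
        if "non-fast-forward" in text:
            return PersistFailureKind.REMOTE_DIVERGENT
    return PersistFailureKind.UNKNOWN


def log_level(kind: PersistFailureKind) -> int:
    """Warning for transient network failures, Error for everything else."""
    if kind is PersistFailureKind.NETWORK_BLIP:
        return logging.WARNING
    return logging.ERROR
```

Falling through to `UNKNOWN` keeps the classifier total: an unrecognized exception still produces a kind for the logging wrapper to route on.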
What this does not protect
- Failure partway through a single transactional step: persistence happens at command boundaries, and only what is on disk gets committed. Partial in-memory state that a handler has not yet written to the working tree is not captured.
- A successful run: persistence runs only on the failure path. Successful runs commit and push as part of `CommitAndPRCommand` and do not need the WIP fallback.
- First-attempt failures with no checkout: if the pipeline fails before `CheckoutSourceCommand`, there is no working tree to persist (`Unknown` kind, no commit pushed).
Related
- Ticket Lifecycle — where persist sits in the InProgress → Failed transition
- Pipeline System — command/handler model and failure path
- Cost Tracking — why preserving partial work matters for token budgets