Sandbox Agent¶

The Sandbox Agent is a small, standalone .NET executable that runs inside each per-pipeline sandbox pod. It pulls Steps from Redis, executes them as shell commands, streams stdout/stderr back, and exits when told to.

It is deliberately boring: no provider knowledge, no pipeline knowledge, no opinion on what languages or toolchains exist. The Sandbox Agent ships as one carrier image, agent-smith-sandbox-agent, which injects its own binary into any official toolchain image via the init-container pattern. Because the binary depends on glibc, the toolchain image must be glibc-based.

What it is¶

A worker process inside a sandbox pod
Driven by a Redis wire format (Step / StepEvent / StepResult)
Self-contained .NET 8 single-file binary, no runtime dependencies beyond glibc
One pipeline = one pod = one agent process (run-once, exit on Shutdown)

What it isn't¶

Not a daemon: no multi-job pooling, no queue server
Not toolchain-aware: it runs whatever command each Step specifies
Not source-aware: git clone (or any other source acquisition) is just a Step the Server pod composes
Not a pre-baked SDK image: see Why init-container, not pre-baked images below

The init-container injection pattern¶

Each sandbox pod has two containers sharing an emptyDir volume:

┌──────────────────── Pod ─────────────────────┐
│                                              │
│  initContainer: agent-smith-sandbox-agent    │
│  ┌────────────────────────────────────────┐  │
│  │  ENTRYPOINT: /agent --inject /shared   │  │
│  │  → cp /agent /shared/agent (exit 0)    │  │
│  └────────────────────────────────────────┘  │
│             │                                │
│             ▼ writes /shared/agent           │
│  ┌────────────── emptyDir /shared ─────────┐ │
│  │  agent (executable, ~80 MB)             │ │
│  └─────────────────────────────────────────┘ │
│             ▲ reads /shared/agent            │
│  ┌────────────────────────────────────────┐  │
│  │  main: mcr.microsoft.com/dotnet/sdk:8  │  │
│  │  command: ['/shared/agent']            │  │
│  │  args: ['--redis-url', '...',          │  │
│  │         '--job-id', 'pipe-42']         │  │
│  │  → JobLoop pulls Steps from Redis      │  │
│  └────────────────────────────────────────┘  │
│                                              │
└──────────────────────────────────────────────┘
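The copy step in the diagram can be sketched in a few lines. This is a Python illustration of the idea only, not the agent's actual .NET code: `inject_binary` and its parameters are hypothetical names, and treating the destination as a full file path is an assumption.

```python
import os
import shutil
import stat
import tempfile


def inject_binary(source: str, destination: str) -> str:
    """Copy an executable to `destination` and mark it executable.

    Hypothetical helper mirroring `cp /agent /shared/agent` from the diagram.
    """
    os.makedirs(os.path.dirname(destination) or ".", exist_ok=True)
    shutil.copyfile(source, destination)
    # rwxr-xr-x, so the toolchain container can run it whatever user it uses.
    os.chmod(destination, 0o755)
    return destination


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fake_agent = os.path.join(tmp, "agent")
        with open(fake_agent, "wb") as fh:
            fh.write(b"#!/bin/sh\necho agent\n")
        target = inject_binary(fake_agent, os.path.join(tmp, "shared", "agent"))
        print(target, oct(stat.S_IMODE(os.stat(target).st_mode)))
```

The init container exits as soon as the copy is done, so the only state it leaves behind is the file in the shared volume.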

Sample Pod spec the Server pod will produce (in p0116):

apiVersion: v1
kind: Pod
metadata:
  name: sandbox-pipe-42
spec:
  restartPolicy: Never
  volumes:
    - name: shared
      emptyDir: {}
    - name: work
      emptyDir: {}
  initContainers:
    - name: inject-agent
      image: holgerleichsenring/agent-smith-sandbox-agent:1.0.0
      volumeMounts:
        - name: shared
          mountPath: /shared
      # default CMD already runs `--inject /shared/agent`
  containers:
    - name: toolchain
      image: mcr.microsoft.com/dotnet/sdk:8.0   # <-- unmodified upstream
      command: ['/shared/agent']
      args:
        - --redis-url
        - redis://redis.agentsmith.svc.cluster.local:6379
        - --job-id
        - pipe-42
      env:
        - name: GIT_TOKEN
          valueFrom:
            secretKeyRef:
              name: agentsmith-secrets
              key: github-token
      volumeMounts:
        - name: shared
          mountPath: /shared
        - name: work
          mountPath: /work
      workingDir: /work

Why init-container, not pre-baked images¶

A pre-baked-image strategy would mean we publish sandbox-dotnet, sandbox-node, sandbox-java, sandbox-salesforce-cli … one image per language we want to support. We rejected that approach because:

Concern	Pre-baked images	Init-container injection
Images we maintain	One per toolchain	Exactly one (the agent)
Toolchain version updates	We rebuild on every release	Operator pulls upstream
Operator-custom toolchains	Operator forks our image	Operator references their own image directly
Adding a new language	Publish a new image	Zero changes from us
Image size per pod	~900 MB (SDK + agent)	~80 MB carrier + upstream image (cached on the node)

The cost is one extra container per pod (the initContainer). It only copies a single file and exits, so the added startup time is small once the carrier image is cached on the node.

Redis wire format¶

Three keys per job, all under the sandbox:{jobId}: namespace:

Key	Type	Direction	Purpose
`…:in`	LIST	Server → Agent	Steps to execute
`…:events`	STREAM	Agent → consumers	Soft-batched stdout/stderr lines
`…:results`	LIST	Agent → Server	One StepResult per Step
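Spelled out, the three key names for a job look like this. The helper name `sandbox_keys` is hypothetical; the key pattern is the one from the table.

```python
def sandbox_keys(job_id: str) -> dict:
    """Return the three Redis key names used for one job."""
    prefix = f"sandbox:{job_id}"
    return {
        "in": f"{prefix}:in",            # LIST, Server -> Agent
        "events": f"{prefix}:events",    # STREAM, Agent -> consumers
        "results": f"{prefix}:results",  # LIST, Agent -> Server
    }


if __name__ == "__main__":
    for name, key in sandbox_keys("pipe-42").items():
        print(name, key)
```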

Step (input)¶

{
  "schemaVersion": 1,
  "stepId": "11111111-1111-1111-1111-111111111111",
  "kind": "run",
  "command": "git",
  "args": ["clone", "https://github.com/foo/bar.git", "."],
  "workingDirectory": "/work",
  "env": null,
  "timeoutSeconds": 600
}

kind is run (default) or shutdown. Run steps require command. The agent inherits its pod's environment, so secrets like GIT_TOKEN are available without putting them in step.env (which would land in Redis and be readable via redis-cli).
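The rules above (default kind, required command) can be captured in a small producer/consumer sketch. `make_run_step` and `parse_step` are hypothetical helpers written for illustration; the agent's real validation may differ.

```python
import json
import uuid
from typing import List, Optional


def make_run_step(
    command: str,
    args: Optional[List[str]] = None,
    working_directory: str = "/work",
    timeout_seconds: int = 600,
) -> str:
    """Serialize a run Step in the wire format shown above."""
    return json.dumps({
        "schemaVersion": 1,
        "stepId": str(uuid.uuid4()),
        "kind": "run",
        "command": command,
        "args": args or [],
        "workingDirectory": working_directory,
        "env": None,  # secrets come from the pod environment, not from Redis
        "timeoutSeconds": timeout_seconds,
    })


def parse_step(payload: str) -> dict:
    """Parse a Step and enforce the documented rules."""
    step = json.loads(payload)
    step.setdefault("kind", "run")  # run is the default kind
    if step["kind"] not in ("run", "shutdown"):
        raise ValueError(f"unknown step kind: {step['kind']!r}")
    if step["kind"] == "run" and not step.get("command"):
        raise ValueError("run steps require a command")
    return step


if __name__ == "__main__":
    payload = make_run_step("git", ["clone", "https://github.com/foo/bar.git", "."])
    print(parse_step(payload)["command"])
```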

StepEvent (output)¶

{
  "schemaVersion": 1,
  "stepId": "11111111-1111-1111-1111-111111111111",
  "kind": "stdout",
  "line": "Cloning into '.'...",
  "timestamp": "2026-05-05T10:00:00.123+00:00"
}

kind is one of started, stdout, stderr, completed. Events are soft-batched (50 lines OR 100 ms, whichever fires first) for efficiency without sacrificing live-progress feel.
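The "50 lines OR 100 ms" rule can be sketched as a small buffer with two flush triggers. This is an illustrative Python model, not the agent's implementation: `SoftBatcher` is a hypothetical name, and a real agent would presumably drive the time-based flush from a timer rather than from explicit `poll()` calls.

```python
import time
from typing import Callable, List, Optional


class SoftBatcher:
    """Buffer lines and flush when a size or age threshold is reached."""

    def __init__(
        self,
        flush: Callable[[List[str]], None],
        max_lines: int = 50,
        max_delay: float = 0.1,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_lines = max_lines
        self._max_delay = max_delay
        self._clock = clock
        self._pending: List[str] = []
        self._first_at: Optional[float] = None  # arrival time of the oldest line

    def add(self, line: str) -> None:
        if not self._pending:
            self._first_at = self._clock()
        self._pending.append(line)
        self.poll()

    def poll(self) -> None:
        """Flush if either threshold is reached; call periodically as well."""
        if not self._pending:
            return
        too_many = len(self._pending) >= self._max_lines
        too_old = self._clock() - self._first_at >= self._max_delay
        if too_many or too_old:
            self.flush_now()

    def flush_now(self) -> None:
        """Emit whatever is pending, e.g. when the step completes."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._flush(batch)


if __name__ == "__main__":
    batcher = SoftBatcher(lambda batch: print(len(batch), "lines"))
    for i in range(120):
        batcher.add(f"line {i}")
    batcher.flush_now()
```

A chatty step fills batches by line count, while a slow step still surfaces each line within roughly the delay window.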

StepResult (output)¶

{
  "schemaVersion": 1,
  "stepId": "11111111-1111-1111-1111-111111111111",
  "exitCode": 0,
  "timedOut": false,
  "durationSeconds": 1.23,
  "errorMessage": null
}
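As a rough model of how a run Step turns into a StepResult, the sketch below executes the command with a timeout and fills in the fields shown above. Several details are assumptions rather than documented behavior: the `-1` exit code used for timeouts and failed starts, the wording of `errorMessage`, and layering `step.env` over the inherited environment. Output streaming is omitted.

```python
import os
import subprocess
import sys
import time


def run_step(step: dict) -> dict:
    """Execute one run Step and return a StepResult-shaped dict."""
    started = time.monotonic()
    exit_code, timed_out, error = -1, False, None  # -1 is an assumed placeholder
    # Assumption: step.env is layered over the inherited pod environment.
    env = {**os.environ, **(step.get("env") or {})}
    try:
        completed = subprocess.run(
            [step["command"], *(step.get("args") or [])],
            cwd=step.get("workingDirectory"),
            env=env,
            timeout=step.get("timeoutSeconds"),
            capture_output=True,
        )
        exit_code = completed.returncode
    except subprocess.TimeoutExpired:
        timed_out = True
        error = f"timed out after {step['timeoutSeconds']} s"
    except OSError as exc:  # e.g. command not found
        error = str(exc)
    return {
        "schemaVersion": 1,
        "stepId": step["stepId"],
        "exitCode": exit_code,
        "timedOut": timed_out,
        "durationSeconds": round(time.monotonic() - started, 2),
        "errorMessage": error,
    }


if __name__ == "__main__":
    print(run_step({
        "stepId": "demo",
        "command": sys.executable,
        "args": ["-c", "print('hello')"],
        "timeoutSeconds": 10,
    }))
```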

Lifecycle¶

boot
 → connect to Redis (5 reconnect attempts, AbortOnConnectFail=false)
 → loop:
     LPOP sandbox:{jobId}:in  (60 s deadline)
       null?  → idle cycle (max 5 then exit 2)
       Shutdown? → exit 0
       Run?   → execute, stream events, push StepResult
 → on SIGINT/SIGTERM: cancel loop, dispose bus, exit
 → on unhandled exception: log to stderr, exit 3
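The core of this loop can be modeled without Redis. In the sketch below, `pop_step` stands in for the LPOP with its 60 s deadline and `handle_run` for step execution; both names are hypothetical. Resetting the idle counter after each received Step is an assumption, since the source only states the maximum of 5 idle cycles. Signal handling and the exit-3 path are left out.

```python
import json
from collections import deque
from typing import Callable, Optional

EXIT_OK = 0    # Shutdown step received
EXIT_IDLE = 2  # too many idle cycles


def job_loop(
    pop_step: Callable[[], Optional[str]],
    handle_run: Callable[[dict], None],
    max_idle_cycles: int = 5,
) -> int:
    """Run until a shutdown Step arrives or the idle limit is hit."""
    idle = 0
    while True:
        payload = pop_step()  # None means the deadline passed with no Step
        if payload is None:
            idle += 1
            if idle >= max_idle_cycles:
                return EXIT_IDLE
            continue
        idle = 0  # assumption: only consecutive idle cycles count
        step = json.loads(payload)
        if step.get("kind", "run") == "shutdown":
            return EXIT_OK
        handle_run(step)


if __name__ == "__main__":
    queue = deque([
        json.dumps({"stepId": "1", "kind": "run", "command": "echo"}),
        json.dumps({"stepId": "2", "kind": "shutdown"}),
    ])
    code = job_loop(
        lambda: queue.popleft() if queue else None,
        lambda step: print("running", step["command"]),
    )
    print("exit code", code)
```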

Local debugging¶

Smoke 1 — agent against a local Redis¶

docker run --rm -d -p 6379:6379 redis:7-alpine

dotnet run --project src/AgentSmith.Sandbox.Agent -- \
  --redis-url redis://localhost:6379 \
  --job-id smoke-test \
  --verbose

# in another terminal (RPUSH keeps Steps in FIFO order with the agent's LPOP):
redis-cli RPUSH sandbox:smoke-test:in '{
  "schemaVersion":1,
  "stepId":"00000000-0000-0000-0000-000000000001",
  "kind":"run",
  "command":"echo",
  "args":["hello"],
  "timeoutSeconds":10
}'

redis-cli XRANGE sandbox:smoke-test:events - +
redis-cli LRANGE sandbox:smoke-test:results 0 -1

redis-cli RPUSH sandbox:smoke-test:in '{
  "schemaVersion":1,
  "stepId":"00000000-0000-0000-0000-000000000002",
  "kind":"shutdown"
}'
# agent exits 0

Smoke 2 — init-container injection¶

Mimics the K8s two-container pattern using docker volumes:

docker build -t agent-smith-sandbox-agent:smoke \
  -f src/AgentSmith.Sandbox.Agent/Dockerfile .

docker volume create as-shared

# init-container behavior: copy binary into the shared volume
docker run --rm -v as-shared:/shared agent-smith-sandbox-agent:smoke

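# main container behavior: run injected binary inside an UNMODIFIED upstream image
# (--network=host assumes a Linux host; on Docker Desktop, if localhost is
#  unreachable, drop it and use redis://host.docker.internal:6379 instead)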
docker run --rm --network=host -v as-shared:/shared \
  mcr.microsoft.com/dotnet/sdk:8.0 \
  /shared/agent --redis-url redis://localhost:6379 \
                --job-id inject-smoke --verbose

If Smoke 2 succeeds, the injected binary runs inside an unmodified upstream image, which validates the init-container injection locally without needing a K8s cluster.

For the Server-side orchestration (ISandbox, KubernetesSandbox, pod lifecycle, RBAC), see Sandbox Architecture.