Agent Harness Pattern (ADR-005)

Architecture decision record for the Frontier Agent Harness — the eight-component pattern that governs how autonomous agents are dispatched, governed, and audited across the Federal Frontier Platform.

The Agent Harness is the standardized pattern through which the Federal Frontier Platform dispatches, governs, and audits autonomous agents. Every agent — whether triggered by an alert, a scheduled job, or a human request — runs inside this harness. The harness enforces risk governance, injects context from the FFO knowledge graph, provides the MCP tool surface, captures audit trails, and writes outcomes back to the world model.

ADR-005 defines the eight components that every agent harness implementation must include. The first production implementation is the Dispatch Controller deployed on the Frontier Management Cluster (FMC), validated March 29, 2026.

Decision

Adopt a standardized eight-component harness pattern for all autonomous agent operations. No agent executes outside the harness. The harness is the control plane for agent behavior — it determines what the agent can see, what it can do, and what happens with its output.

Eight Components

graph TD
    T[1. Trigger Layer] --> RG[2. Risk Governance]
    RG --> CI[3. Context Injection]
    CI --> TS[4. Tool Surface]
    TS --> Agent["Agent Execution<br/>Claude Code in K8s Job"]
    Agent --> OP[5. Output Parsing]
    OP --> AT[6. Audit Trail]
    OP --> ER[7. Escalation Routing]
    OP --> WB[8. World Model Write-Back]
    style T fill:#c53030,stroke:#fc8181,color:#fff
    style RG fill:#553c9a,stroke:#805ad5,color:#e2e8f0
    style CI fill:#2c7a7b,stroke:#38b2ac,color:#e2e8f0
    style TS fill:#2d3748,stroke:#4299e1,color:#e2e8f0
    style Agent fill:#2b6cb0,stroke:#4299e1,color:#fff
    style OP fill:#2d3748,stroke:#4299e1,color:#e2e8f0
    style AT fill:#2d3748,stroke:#4299e1,color:#e2e8f0
    style ER fill:#d69e2e,stroke:#ecc94b,color:#1a202c
    style WB fill:#2c7a7b,stroke:#38b2ac,color:#e2e8f0

1. Trigger Layer

The entry point. Accepts events from external systems and translates them into dispatch requests. The trigger layer validates the payload format and extracts the resource identifier, severity, and description.

FMC implementation: FastAPI /dispatch endpoint accepting Grafana Alertmanager webhook payloads.
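The validation and extraction step can be sketched without the web framework. This is a minimal illustration, not the Dispatch Controller's code: the `DispatchRequest` type and the `parse_alertmanager_payload` function are hypothetical names, and it assumes the resource identifier arrives in the alert's `instance` label and that resolved alerts are skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DispatchRequest:
    # Hypothetical shape of a dispatch request.
    resource_id: str
    severity: str
    description: str


def parse_alertmanager_payload(payload: dict) -> list[DispatchRequest]:
    """Validate a webhook payload and return one request per firing alert."""
    alerts = payload.get("alerts")
    if not isinstance(alerts, list) or not alerts:
        raise ValueError("payload must contain a non-empty 'alerts' list")

    requests = []
    for alert in alerts:
        # Assumption: only firing alerts spawn an agent.
        if alert.get("status") != "firing":
            continue
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        # Assumption: the resource identifier travels in the 'instance' label.
        resource_id = labels.get("instance")
        severity = labels.get("severity")
        if not resource_id or not severity:
            raise ValueError("alert is missing 'instance' or 'severity' label")
        requests.append(
            DispatchRequest(
                resource_id=resource_id,
                severity=severity.lower(),
                description=annotations.get("description", ""),
            )
        )
    return requests
```

Rejecting malformed payloads at this boundary keeps every later component working with a small, typed request instead of raw webhook JSON.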

2. Risk Governance

Every dispatch request is evaluated against policy before an agent is spawned. Risk governance classifies the operation as LOW, MEDIUM, or HIGH based on the resource type, severity, blast radius (number of dependents in the FFO graph), and organizational policy constraints.

FMC implementation: OPA Wasm — in-process Rego policy evaluation via opa-python-client. The Rego policy defines risk classification rules for VitroAI infrastructure resource types (hypervisors, networks, storage pools, compute instances, application deployments). LOW risk operations dispatch autonomously. MEDIUM and HIGH risk operations pause at a human approval gate.
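To make the classification inputs concrete, here is a sketch of the decision logic in Python rather than Rego. The resource-type set and the blast-radius thresholds are invented for illustration, organizational policy constraints are left out, and the actual rules are whatever `risk_classification.rego` defines.

```python
from enum import Enum


class Risk(str, Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


# Hypothetical policy data; the real values live in the Rego policy.
CRITICAL_RESOURCE_TYPES = {"hypervisor", "network", "storage_pool"}
BLAST_RADIUS_MEDIUM = 5   # dependents in the FFO graph
BLAST_RADIUS_HIGH = 20


def classify_risk(resource_type: str, severity: str, dependents: int) -> Risk:
    """Classify an operation from resource type, severity, and blast radius."""
    if dependents >= BLAST_RADIUS_HIGH:
        return Risk.HIGH
    if resource_type in CRITICAL_RESOURCE_TYPES and severity == "critical":
        return Risk.HIGH
    if dependents >= BLAST_RADIUS_MEDIUM or resource_type in CRITICAL_RESOURCE_TYPES:
        return Risk.MEDIUM
    return Risk.LOW
```

Under these assumed thresholds, a warning on a compute instance with no dependents classifies as LOW, while any resource with 20 or more dependents classifies as HIGH regardless of type.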

3. Context Injection

Before the agent begins, the harness fetches relevant context from the FFO knowledge graph. This gives the agent environment-specific knowledge that the LLM cannot derive from training data: what is running on this resource, what depends on it, what happened last time this alert fired, what classification level governs this workload.

FMC implementation: Calls ffo.context.for_action via the FFO MCP server. The context includes entity attributes, relationship graph (dependencies, deployments, services), prior incident history, and classification metadata.
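One way to picture the injected context is as a small record rendered into the prompt. The sketch below assumes a hypothetical `ActionContext` shape that mirrors the four kinds of context listed above; the real response format of ffo.context.for_action is not documented here, and plain string formatting stands in for the Jinja2 templates.

```python
from dataclasses import dataclass, field


@dataclass
class ActionContext:
    # Hypothetical container for the context returned by the FFO graph.
    entity_id: str
    attributes: dict = field(default_factory=dict)
    dependents: list = field(default_factory=list)
    prior_incidents: list = field(default_factory=list)
    classification: str = "unknown"


def render_context_block(ctx: ActionContext) -> str:
    """Render the context as plain text suitable for inclusion in a prompt."""
    lines = [
        f"Resource: {ctx.entity_id}",
        f"Classification: {ctx.classification}",
        f"Dependents ({len(ctx.dependents)}): " + (", ".join(ctx.dependents) or "none"),
    ]
    for key in sorted(ctx.attributes):
        lines.append(f"Attribute {key}: {ctx.attributes[key]}")
    if ctx.prior_incidents:
        lines.append("Prior incidents:")
        lines.extend(f"  - {item}" for item in ctx.prior_incidents)
    else:
        lines.append("Prior incidents: none recorded")
    return "\n".join(lines)
```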

4. Tool Surface

The harness determines which MCP tools the agent can access. The tool surface is not hardcoded — it is dynamically generated from the platform’s MCP server registry.

FMC implementation: Queries the mcp_servers and mcp_tools tables in Postgres to generate a dynamic mcp.json configuration. The current FMC deployment provides 13 MCP servers exposing 153+ tools: Grafana (28), OpenStack (23), Gitea (22), Atlassian (20), Keycloak (12), Ceph (12), ArgoCD (11), Kolla (10), FFO (10), Federal Compliance (3), Tool Hub (1), Trailboss (1), and Web. The per-server counts listed sum to 153; no count is given for Web.
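The registry-to-config step might look like the following sketch. It is illustrative only: `sqlite3` stands in for Postgres so the example runs without a server, the `name`, `url`, and `enabled` columns are assumed rather than taken from the real schema, the URLs are placeholders, and the `mcpServers` layout with an HTTP transport is one common MCP client configuration shape, not a confirmed copy of the generated file.

```python
import json
import sqlite3


def build_mcp_config(conn) -> dict:
    """Build an mcp.json-style mapping from the enabled registry rows."""
    rows = conn.execute(
        "SELECT name, url FROM mcp_servers WHERE enabled = 1 ORDER BY name"
    ).fetchall()
    return {"mcpServers": {name: {"type": "http", "url": url} for name, url in rows}}


# sqlite3 stands in for Postgres; the schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mcp_servers (name TEXT PRIMARY KEY, url TEXT, enabled INTEGER)"
)
conn.executemany(
    "INSERT INTO mcp_servers VALUES (?, ?, ?)",
    [
        ("grafana", "http://grafana-mcp.example.internal/mcp", 1),
        ("ceph", "http://ceph-mcp.example.internal/mcp", 1),
        ("legacy", "http://legacy-mcp.example.internal/mcp", 0),
    ],
)

config = build_mcp_config(conn)
print(json.dumps(config, indent=2))
```

Because the configuration is derived from registry rows at dispatch time, registering or disabling a server changes the agent's tool surface without rebuilding the runtime image.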

5. Output Parsing

When the agent completes, the harness captures and parses the output. This includes the remediation outcome (success, failure, partial), evidence collected during investigation, actions taken, and any post-mortem narrative the agent generated.

FMC implementation: K8s Job exit code determines success/failure. Structured output is captured from the Job’s stdout logs.
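A parser along these lines could combine the exit code with structured stdout. The convention that the agent prints its report as a single-line JSON object, and the `status`, `actions`, `evidence`, and `narrative` keys, are assumptions made for this sketch; the source does not specify the log format.

```python
import json
from dataclasses import dataclass, field

ALLOWED_STATUSES = {"success", "failure", "partial"}


@dataclass
class AgentOutcome:
    status: str
    actions: list = field(default_factory=list)
    evidence: list = field(default_factory=list)
    narrative: str = ""


def parse_job_output(exit_code: int, stdout: str) -> AgentOutcome:
    """Derive an outcome from the Job exit code and captured stdout."""
    report = {}
    # Scan from the end: the last line that parses as a JSON object wins.
    for line in reversed(stdout.splitlines()):
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            candidate = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(candidate, dict):
            report = candidate
            break

    # A non-zero exit code always means failure; otherwise trust the report.
    if exit_code != 0:
        status = "failure"
    else:
        status = report.get("status")
        if status not in ALLOWED_STATUSES:
            status = "success"

    return AgentOutcome(
        status=status,
        actions=list(report.get("actions", [])),
        evidence=list(report.get("evidence", [])),
        narrative=str(report.get("narrative", "")),
    )
```

Letting the exit code override the report means a crashed or killed agent can never be recorded as a success, even if it printed an optimistic report before dying.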

6. Audit Trail

Every dispatch is recorded with full provenance: the triggering event, risk classification, context injected, prompt rendered, tools called, actions taken, outcome, and duration. The audit trail is immutable and queryable.

FMC implementation: K8s Job metadata and logs provide the execution record. FFO write-back creates a persistent record in the knowledge graph linked to the triggering entity.
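The provenance fields listed above map naturally onto a single record type. The `AuditRecord` class below is a hypothetical sketch: a frozen dataclass only prevents in-process mutation, so durable immutability would still depend on where the record is stored.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    # One field per provenance item; names are illustrative.
    trigger_event: str
    risk: str
    context_summary: str
    prompt: str
    tools_called: tuple
    actions_taken: tuple
    outcome: str
    duration_seconds: float

    def to_json_line(self) -> str:
        """Serialize with sorted keys so records are stable and easy to query."""
        return json.dumps(asdict(self), sort_keys=True)
```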

7. Escalation Routing

When the agent cannot resolve an issue autonomously — because the risk classification requires human approval, or because the remediation failed, or because the agent encountered a condition outside its competence — the harness routes the escalation to the appropriate human.

FMC implementation: Risk-based gating. MEDIUM and HIGH risk classifications trigger an approval gate before the agent is dispatched. Failed remediations are recorded in FFO and flagged for human review.
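The gating rules reduce to a small routing function. This sketch uses hypothetical step names (`await_approval`, `dispatch`, `human_review`, `closed`) and assumes that any outcome short of full success is sent to a human; the source only states that failed remediations are flagged.

```python
from typing import Optional

APPROVAL_REQUIRED = {"MEDIUM", "HIGH"}


def route(risk: str, approved: bool = False, outcome: Optional[str] = None) -> str:
    """Return the next step: pre-dispatch when outcome is None, post-run otherwise."""
    if outcome is None:
        if risk in APPROVAL_REQUIRED and not approved:
            return "await_approval"
        return "dispatch"
    # Assumption: partial results are reviewed the same way as failures.
    if outcome != "success":
        return "human_review"
    return "closed"
```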

8. World Model Write-Back

After every dispatch, the harness writes the outcome back to the FFO knowledge graph. This is how the platform accumulates institutional memory without humans authoring runbooks. The next time the same alert fires on the same resource, the agent’s context injection will include the prior remediation — what was tried, what worked, what the environment looked like at the time.

FMC implementation: Calls ffo.write via the FFO MCP server. Creates change records, findings, and incident relationships linked to the affected entity.
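The memory loop described above can be illustrated with an in-memory stand-in for the knowledge graph. `InMemoryWorldModel` and its methods are hypothetical and do not reflect the ffo.write schema; the point is only that a write-back keyed by resource and alert becomes context for the next dispatch of the same pair.

```python
from collections import defaultdict


class InMemoryWorldModel:
    """Stand-in for the FFO graph: remediation records keyed by (resource, alert)."""

    def __init__(self):
        self._records = defaultdict(list)

    def write_back(self, resource_id, alert_name, actions, outcome):
        """Record what was tried and how it ended."""
        self._records[(resource_id, alert_name)].append(
            {"actions": list(actions), "outcome": outcome}
        )

    def prior_remediations(self, resource_id, alert_name):
        """Return earlier remediations for the same alert on the same resource."""
        return list(self._records.get((resource_id, alert_name), []))


world = InMemoryWorldModel()
world.write_back("hv-01", "HighDiskUsage", ["rotated logs"], "success")
# A later dispatch for the same alert and resource sees the earlier attempt.
history = world.prior_remediations("hv-01", "HighDiskUsage")
```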

Agent Runtime

The agent runtime is a container image (claude-runner) based on node:20-slim with Claude Code installed. The container’s entrypoint is claude — no wrapper scripts, no shell interpretation. The Dispatch Controller passes the rendered prompt directly as Kubernetes Job args.

Each dispatched agent is an isolated, short-lived Kubernetes Job:

- No persistent state: reads context from FFO, acts via MCP, writes back to FFO, terminates
- RBAC-scoped: the Job’s service account limits Kubernetes API access
- Inference via Bedrock: Claude Sonnet 4.6 or Opus 4.6 via AWS VPC PrivateLink (sovereign, no public internet)
- Full tool access: dynamic mcp.json provides the complete MCP tool surface
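A Job of this kind can be described as a plain `batch/v1` manifest. The sketch below is an assumption-laden illustration, not the output of dispatch.py: the function name, the `backoffLimit` and `ttlSecondsAfterFinished` values, and the use of Claude Code's `-p` (print) flag for non-interactive runs are all choices made for the example.

```python
def build_job_manifest(name: str, prompt: str, image: str, service_account: str) -> dict:
    """Build a batch/v1 Job that runs the agent once with the rendered prompt."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,                # assumed: do not retry a failed agent
            "ttlSecondsAfterFinished": 3600,  # assumed: clean up finished Jobs
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "serviceAccountName": service_account,
                    "containers": [
                        {
                            "name": "agent",
                            "image": image,
                            # No 'command' override: the image entrypoint runs,
                            # and the prompt is passed as one argv element.
                            "args": ["-p", prompt],
                        }
                    ],
                }
            },
        },
    }
```

Passing the prompt as a list element avoids shell quoting problems entirely. Kubernetes still expands `$(VAR_NAME)` references in `args` against the container's environment, so a prompt containing that pattern would need it escaped as `$$(VAR_NAME)`.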

Inference Backends

The harness supports multiple inference backends selected by classification level:

Classification	Backend	Model	Sovereignty
IL2-IL4 CUI	AWS Bedrock VPC PrivateLink	Claude Sonnet/Opus 4.6	FedRAMP Moderate — no public internet
IL5	Bedrock GovCloud	Claude Sonnet/Opus	FedRAMP High / DoD SRG
IL6 Air-Gapped	vLLM on VitroAI	US-origin models	Agency RMF, no external network
Tactical Edge	Ollama on Ampere ARM64	Compact models	Disconnected operation
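Selection by classification level amounts to a lookup over the table above. In this sketch the dictionary keys, the alias handling, and the choice to raise on an unknown classification are assumptions; the backend and model strings are copied from the table.

```python
BACKENDS = {
    "IL2-IL4": {"backend": "AWS Bedrock VPC PrivateLink", "model": "Claude Sonnet/Opus 4.6"},
    "IL5": {"backend": "Bedrock GovCloud", "model": "Claude Sonnet/Opus"},
    "IL6": {"backend": "vLLM on VitroAI", "model": "US-origin models"},
    "EDGE": {"backend": "Ollama on Ampere ARM64", "model": "Compact models"},
}

# Hypothetical aliases mapping individual levels onto the table rows.
_ALIASES = {
    "IL2": "IL2-IL4",
    "IL3": "IL2-IL4",
    "IL4": "IL2-IL4",
    "TACTICAL EDGE": "EDGE",
}


def select_backend(classification: str) -> dict:
    """Return the backend entry for a classification, failing on unknown levels."""
    key = classification.strip().upper()
    key = _ALIASES.get(key, key)
    if key not in BACKENDS:
        raise ValueError(f"no inference backend configured for {classification!r}")
    return BACKENDS[key]
```

Raising on an unrecognized classification, rather than falling back to a default backend, keeps a mislabeled workload from being sent to a less restricted inference path.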

Repository

The agent harness implementation lives in the frontier-agent-harness repository (Gitea and GitLab):

frontier-agent-harness/
  claude-runner/          # Agent runtime container (node:20-slim + Claude Code)
  dispatch-controller/    # Dispatch Controller (FastAPI)
    app/
      main.py             # /dispatch endpoint
      dispatch.py         # K8s Job builder
      models.py           # Pydantic models (SREEvent, DispatchResult)
      policy.py           # OPA risk evaluation
      context.py          # FFO context fetch
      prompt.py           # Jinja2 prompt rendering by severity
      registry.py         # Dynamic mcp.json from Postgres
      ffo.py              # FFO write-back
    policies/
      risk_classification.rego  # OPA risk policy
  k8s/                    # Base Kubernetes manifests (SA, RBAC, deployment, service)
  k8s-vitroai/            # VitroAI overlay (Keycloak token exchange, vLLM config)
  token-exchanger/        # Keycloak SA token exchange init container

Status

Milestone	Status	Date
ADR-005 approved	Done	March 2026
Claude-runner image built and tested	Done	March 29, 2026
Dispatch Controller deployed on FMC	Done	March 29, 2026
Bedrock inference validated (Sonnet 4.6, Opus 4.6)	Done	March 29, 2026
OPA risk classification validated	Done	March 29, 2026
End-to-end autonomous dispatch	Done	March 29, 2026
Production alert integration	Planned	—
VitroAI overlay (IL6 air-gapped)	Planned	—

Relationship to TrailbossAI / Posses

The Agent Architecture page describes the TrailbossAI (LangGraph) and Posse (CrewAI) orchestration model, the strategic target architecture for multi-agent coordination with specialized roles (Marshal, Scout, Sage, Wrangler). The Agent Harness is the validated infrastructure layer that makes autonomous dispatch possible. When TrailbossAI and Posses are deployed, they will run inside the harness: the harness provides the trigger, governance, context, tools, audit, and write-back, and TrailbossAI provides the multi-agent orchestration within the Job.