Unified Chat — Bedrock + MCP Tools + Claude Code Dispatch

The unified chat module powers the AI Assistant in OutpostAI and Compass. It uses AWS Bedrock (Claude) for inference, 13 MCP servers (150+ tools) for agentic tool calling, and Claude Code dispatch for autonomous code generation and deployment.

The unified chat module (unified_chat.py) is the backend that powers the Federal Frontier AI Assistant — the conversational interface available in both OutpostAI and Compass. It replaced two divergent chat implementations (one vLLM-only, one Bedrock-only) with a single agentic backend that provides:

  • AWS Bedrock inference (Claude Sonnet 4.6 / Opus 4.6) with Ollama fallback for edge
  • 150+ MCP tools from 13 servers, loaded via JSON-RPC
  • Agentic loop — up to 10 iterations of LLM reasoning and tool calling per request
  • Claude Code dispatch — spawn Kubernetes Jobs that write code, create PRs, and deploy via GitOps
  • Runtime model switching — operators select Sonnet, Opus, or Ollama from the UI dropdown

Why It Exists

Before April 2026, OutpostAI and Compass each had their own chat backend:

Aspect            OutpostAI (Trailboss API)                         Compass API
LLM               vLLM (Qwen 7B) — hardcoded                        AWS Bedrock (Claude Sonnet/Opus)
Tools             45 LangGraph tools (hardcoded Python functions)   150+ MCP tools via JSON-RPC
Routing           Keyword matching + InfrastructureAgent            3-layer hybrid (intent → posse → agentic)
Model switching   Not supported (UI dropdown was ignored)           Supported
Code generation   Not supported                                     Not supported

The OutpostAI chatbot was broken — vLLM was unavailable, causing every request to return “Connection error.” The model selector dropdown showed “AWS Bedrock — Claude Opus 4.6” but the backend ignored it. The two implementations were diverging with no shared code.

The unified chat module replaces both with a single implementation that both frontends use.

Architecture

graph TD
  User[Operator] --> FE[OutpostAI / Compass<br/>ChatBot.vue]
  FE -->|POST /api/v1/chat| API[Trailboss API<br/>FastAPI]
  API --> UC[Unified Chat Module<br/>unified_chat.py]
  UC --> LLM{LLM Provider}
  LLM -->|bedrock-sonnet| BR[AWS Bedrock<br/>Claude Sonnet 4.6]
  LLM -->|bedrock-opus| BRO[AWS Bedrock<br/>Claude Opus 4.6]
  LLM -->|dev| OL[Ollama<br/>Qwen 35B]
  UC -->|tool calls| MCP[13 MCP Servers<br/>150+ tools via JSON-RPC]
  UC -->|claude_code_dispatch| Job[Claude Code<br/>K8s Job]
  Job -->|code changes| Gitea[Gitea MR]
  Gitea -->|merge| Argo[ArgoCD Sync]
  style User fill:#2b6cb0,stroke:#4299e1,color:#fff
  style FE fill:#2d3748,stroke:#4299e1,color:#e2e8f0
  style API fill:#2d3748,stroke:#4299e1,color:#e2e8f0
  style UC fill:#2c7a7b,stroke:#38b2ac,color:#e2e8f0
  style LLM fill:#553c9a,stroke:#805ad5,color:#e2e8f0
  style BR fill:#553c9a,stroke:#805ad5,color:#e2e8f0
  style BRO fill:#553c9a,stroke:#805ad5,color:#e2e8f0
  style OL fill:#553c9a,stroke:#805ad5,color:#e2e8f0
  style MCP fill:#1a365d,stroke:#4299e1,color:#e2e8f0
  style Job fill:#2b6cb0,stroke:#4299e1,color:#fff
  style Gitea fill:#2d3748,stroke:#4299e1,color:#e2e8f0
  style Argo fill:#2d3748,stroke:#4299e1,color:#e2e8f0

Request Flow

  1. Operator types a message in the ChatBot.vue component (shared by OutpostAI and Compass)
  2. Frontend sends POST /api/v1/chat with {message, image, context, conversation_history}
  3. Trailboss API receives the request and delegates to unified_chat.chat()
  4. Unified chat builds the conversation with a system prompt, loads MCP tools, and enters the agentic loop
  5. The LLM (Bedrock or Ollama) reasons about which tools to call
  6. Tool calls are executed against MCP servers via JSON-RPC
  7. Results are fed back to the LLM for the next iteration
  8. After up to 10 iterations, the final response is returned to the frontend

Agentic Loop

The core of the unified chat is a multi-iteration agentic loop. The LLM does not simply answer questions — it reasons about what tools to call, executes them, interprets results, and decides whether to call more tools or respond.

Iteration 1: User asks "What's the Ceph health?"
  → LLM decides to call ceph_get_cluster_health
  → Tool returns: {health: "HEALTH_WARN", checks: [...]}
  → LLM decides more data needed, calls ceph_get_storage_summary

Iteration 2:
  → Tool returns: {total: "24TB", used: "18TB", pools: [...]}
  → LLM has enough data, generates final response with formatted table

Total: 2 iterations, 2 tool calls, 1 response
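
The loop above can be sketched in a few lines. The message and reply shapes here are assumptions for illustration, not the actual unified_chat.py structures; the scripted LLM replays the Ceph example with one tool call followed by a final answer.

```python
# Minimal sketch of the agentic loop described above; the message and reply
# shapes are assumptions, not the actual unified_chat.py structures.
def agentic_loop(llm, execute_tool, user_message, max_iterations=10):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        reply = llm(messages)                   # may contain text and/or tool calls
        if not reply.get("tool_calls"):
            return reply["text"], messages      # no more tools wanted: final answer
        for call in reply["tool_calls"]:
            result = execute_tool(call["name"], call["args"])
            messages.append({"role": "tool", "name": call["name"], "content": result})
    return "Iteration limit reached.", messages

# Replaying the Ceph example: one tool call, then a final answer.
scripted = iter([
    {"tool_calls": [{"name": "ceph_get_cluster_health", "args": {}}]},
    {"tool_calls": [], "text": "Cluster is HEALTH_WARN"},
])
answer, transcript = agentic_loop(
    llm=lambda msgs: next(scripted),
    execute_tool=lambda name, args: {"health": "HEALTH_WARN"},
    user_message="What's the Ceph health?",
)
```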

Configuration

Parameter                 Bedrock   Ollama
Max iterations            10        5
Max tools per iteration   10        5
Temperature               0.1       0.1
Max tokens                4096      2048
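
These limits can be captured in a small config structure. The LoopConfig dataclass is an illustrative assumption; the values come straight from the table.

```python
# The per-provider limits from the table above as a frozen dataclass.
# LoopConfig itself is an illustrative assumption.
from dataclasses import dataclass

@dataclass(frozen=True)
class LoopConfig:
    max_iterations: int
    max_tools_per_iteration: int
    temperature: float
    max_tokens: int

CONFIGS = {
    "bedrock": LoopConfig(10, 10, 0.1, 4096),
    "ollama": LoopConfig(5, 5, 0.1, 2048),
}
```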

Deduplication

The agentic loop tracks every tool call by (tool_name, arguments) hash. If the LLM attempts to call the same tool with the same arguments twice, the cached result is returned instead of re-executing. This prevents infinite loops and redundant API calls.
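
A minimal sketch of that cache, assuming the key is a hash over (tool_name, arguments). json.dumps with sort_keys gives a stable key regardless of argument dict ordering; the helper names are hypothetical.

```python
# Sketch of the (tool_name, arguments) dedup cache described above.
import hashlib
import json

_result_cache: dict = {}

def call_tool_once(name, args, execute):
    key = hashlib.sha256(json.dumps([name, args], sort_keys=True).encode()).hexdigest()
    if key in _result_cache:            # repeat call: return cached result
        return _result_cache[key]
    _result_cache[key] = execute(name, args)
    return _result_cache[key]

executions = []
def fake_execute(name, args):
    executions.append(name)
    return {"health": "HEALTH_WARN"}

call_tool_once("ceph_get_cluster_health", {}, fake_execute)
call_tool_once("ceph_get_cluster_health", {}, fake_execute)  # served from cache
```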

MCP Tool Loading

On the first chat request, the unified chat module loads tool definitions from all 13 MCP servers via JSON-RPC tools/list calls. Results are cached for subsequent requests.

Server                    → Endpoint (JSON-RPC)                                    → Tools
ffo-mcp-server            → http://ffo-mcp-server.f3iai:50060/jsonrpc              → 7
keycloak-mcp-server       → http://keycloak-mcp-server.f3iai:50057/jsonrpc         → 12
openstack-mcp-server      → http://openstack-mcp-server.f3iai:8080/jsonrpc         → 31
ceph-mcp-server           → http://ceph-mcp-server.f3iai:50054/jsonrpc             → 12
atlassian-mcp-server      → http://atlassian-mcp-server.f3iai:50052/jsonrpc        → 23
grafana-mcp-server        → http://grafana-mcp-server.f3iai:50056/jsonrpc          → 28
argocd-mcp-server         → http://argocd-mcp-server.f3iai:50051/jsonrpc           → 11
gitea-mcp-server          → http://gitea-mcp-server.f3iai:50055/jsonrpc            → 22
kolla-mcp-server          → http://kolla-mcp-server.f3iai:50061/jsonrpc            → 10
k8s-mcp-server            → http://k8s-mcp-server.f3iai:50062/jsonrpc              → 20
web-mcp-server            → http://web-mcp-server.f3iai:50063/jsonrpc              → 4
federal-compliance-mcp    → http://federal-compliance-mcp-server.f3iai:50064/jsonrpc → 3
trailboss-mcp-server      → http://trailboss-mcp-server.f3iai:50059/jsonrpc        → 1

Tool names are prefixed with the server name to avoid collisions (e.g., keycloak_list_users, openstack_list_vms). The MCP tool format is converted to OpenAI function-calling format, then to Bedrock toolSpec format for the converse API.
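
The prefixing and conversion might look like this minimal sketch. The toolSpec shape follows the Bedrock converse API; the sample Keycloak tool definition is hypothetical.

```python
# Sketch of converting an MCP tool definition into Bedrock's toolSpec shape,
# prefixing the name with the server to avoid collisions.
def mcp_tool_to_bedrock(server: str, tool: dict) -> dict:
    return {
        "toolSpec": {
            "name": f"{server}_{tool['name']}",   # server prefix avoids collisions
            "description": tool.get("description", ""),
            "inputSchema": {"json": tool.get("inputSchema", {"type": "object"})},
        }
    }

spec = mcp_tool_to_bedrock("keycloak", {
    "name": "list_users",
    "description": "List users in a realm",
    "inputSchema": {"type": "object", "properties": {"realm": {"type": "string"}}},
})
```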

Each MCP server URL is configurable via environment variables (e.g., KEYCLOAK_MCP_URL, CEPH_MCP_URL) to support both in-cluster service DNS and local development with port-forwarding.
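
A sketch of resolving an endpoint from its environment variable and building the JSON-RPC 2.0 tools/list request. The helper name is hypothetical; the real module would POST the body (e.g., httpx.post(url, json=body)) and read result["tools"] from the response.

```python
# Sketch: resolve the MCP base URL (env var or in-cluster default) and build
# the JSON-RPC 2.0 tools/list request body.
import os

def jsonrpc_tools_list_request(env_var: str, default_url: str):
    base = os.environ.get(env_var, default_url)   # in-cluster DNS or port-forward
    body = {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}
    return base + "/jsonrpc", body

url, body = jsonrpc_tools_list_request(
    "KEYCLOAK_MCP_URL", "http://keycloak-mcp-server.f3iai:50057")
```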

Tool Categories

Infrastructure Query (Read)

Category          Example Tools                                             What They Do
Compute           openstack_list_vms, openstack_get_vm                      List and inspect VMs, flavors, images
Kubernetes        k8s_list_nodes, k8s_list_pods, k8s_list_clusters          Nodes, pods, CAPI clusters and machines
Storage           ceph_get_cluster_health, ceph_list_pools                  Ceph health, OSDs, pools, capacity
Networking        openstack_list_networks, openstack_list_security_groups   Networks, subnets, routers, floating IPs
Identity          keycloak_list_users, keycloak_get_user                    Users, roles, groups, sessions, realms
Monitoring        grafana_query_prometheus, grafana_get_firing_alerts       PromQL, LogQL, dashboards, alert rules
GitOps            argocd_list_applications, gitea_list_repos                ArgoCD apps, Gitea repos, branches, commits
Project Mgmt      atlassian_jira_search, atlassian_confluence_get_page      Jira issues, Confluence pages
Containers        kolla_list_containers, kolla_container_health             Kolla OpenStack containers across hypervisors
Knowledge Graph   ffo_query, ffo_entity_get                                 TypeQL queries against the FFO digital twin

Infrastructure Action (Write)

Category       Example Tools                                                   What They Do
Compute        openstack_server_create, openstack_server_reboot                Create, delete, stop, start, rebuild VMs
Kubernetes     k8s_cordon_node, k8s_drain_node, k8s_restart_deployment         Node maintenance, pod deletion, rollouts
CAPI           k8s_scale_machine_deployment, k8s_delete_machine                Scale worker pools, replace machines
Monitoring     grafana_create_alert_rule, grafana_create_dashboard             Create alerts, dashboards, annotations
GitOps         argocd_sync_application, argocd_rollback_application            Trigger syncs, rollback deployments
Project Mgmt   atlassian_jira_create_issue, atlassian_confluence_create_page   Create issues, update docs
Containers     kolla_restart_container, kolla_exec                             Restart services, run diagnostics
Identity       keycloak_set_user_enabled                                       Enable/disable users

Claude Code Dispatch (Code + Deploy)

The claude_code_dispatch tool is unique — it spawns a Claude Code agent as a Kubernetes Job that can write code, create merge requests, and deploy changes through the GitOps pipeline.

Claude Code Dispatch

When an operator asks the chatbot to write code, fix a bug, or deploy a change, the LLM calls claude_code_dispatch with:

Parameter   Required   Description
task        Yes        What the agent should do — be specific about repo, files, expected outcome
repo        No         Gitea repository in owner/repo format (e.g., admin/federal-frontier-platform)
severity    No         Risk level: low (auto-merge), medium (MR for review), high (investigate only)
branch      No         Target branch (defaults to main)

Severity Levels

Level    Agent Behavior                                                     Human Review
LOW      Makes changes, commits, pushes directly                            Post-mortem only
MEDIUM   Creates a new branch, makes changes, opens a Gitea merge request   Operator reviews and merges the MR
HIGH     Investigates only, documents findings                              Operator reads findings, decides next steps

How It Works

  1. The unified chat module receives claude_code_dispatch from the LLM’s tool call
  2. A unique session ID is generated (cc-<12-hex-chars>)
  3. A Kubernetes Job manifest is built with the task prompt, repo, severity, and branch
  4. The Job is created in the f3iai namespace via the K8s MCP server
  5. The Claude Code agent runs with access to the full MCP tool surface
  6. For MEDIUM severity: agent creates a branch, commits changes, opens a Gitea MR
  7. The MR flows through the normal GitOps pipeline — human reviews, merges, ArgoCD syncs
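
Steps 2 through 4 can be sketched as follows. The cc-<12-hex> session ID format and the f3iai namespace come from this doc; the Job name prefix, env-var names, and other manifest fields are illustrative assumptions, not the exact manifest the module emits.

```python
# Sketch of dispatch steps 2-4: generate the session ID and build a minimal
# Kubernetes Job manifest. Env-var names and spec details are assumptions.
import secrets

def build_dispatch_job(task: str, repo: str, severity: str, branch: str = "main") -> dict:
    session_id = "cc-" + secrets.token_hex(6)    # 6 bytes -> 12 hex chars
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"claude-code-{session_id}", "namespace": "f3iai"},
        "spec": {"template": {"spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "claude-code",
                "env": [
                    {"name": "TASK", "value": task},
                    {"name": "REPO", "value": repo},
                    {"name": "SEVERITY", "value": severity},
                    {"name": "BRANCH", "value": branch},
                ],
            }],
        }}},
    }

job = build_dispatch_job("Fix CORS middleware", "admin/federal-frontier-platform", "medium")
```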

Example Interaction

Operator: "Fix the CORS header bug in the openstack-mcp-server and open a PR"

LLM calls: claude_code_dispatch({
  task: "Fix CORS header handling in the OpenStack MCP server. The server
        returns 'Access-Control-Allow-Origin: *' for preflight requests but
        not for actual requests. Fix the middleware in src/main.py.",
  repo: "admin/federal-frontier-platform",
  severity: "medium"
})

Response: "Claude Code agent dispatched as K8s Job 'claude-code-cc-a3f8b1c2d4e5'.
The agent will:
1. Clone the repo
2. Fix the CORS middleware in openstack-mcp-server/src/main.py
3. Create a Gitea merge request for your review

Track progress in the OutpostAI Dispatches panel."

LLM Configuration

Runtime Model Switching

The chatbot UI includes a model selector dropdown. When the operator changes it, the frontend calls:

PUT /api/v1/llm/config?preset=bedrock-opus

This switches the active LLM at runtime — no restart required. Available presets:

Preset           Provider      Model                             Use Case
bedrock-sonnet   AWS Bedrock   us.anthropic.claude-sonnet-4-6    Default — fast, capable, cost-effective
bedrock-opus     AWS Bedrock   us.anthropic.claude-opus-4-6-v1   Complex reasoning, large tool sets
dev              Ollama        qwen3.5:35b-a3b-q4_K_M            Edge / air-gapped / development

Bedrock Integration

The unified chat uses the AWS Bedrock converse API (not invoke_model). This provides:

  • Native tool calling — Bedrock handles tool use/result cycling natively
  • Cross-region inference profiles — model IDs prefixed with us. for cross-region routing
  • Automatic message formatting — OpenAI-format messages are converted to Bedrock’s alternating user/assistant format with toolUse and toolResult blocks
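
A sketch of one converse() round trip plus a helper that wraps a tool result in Bedrock's toolResult content block. converse() is the real bedrock-runtime API; the model ID and region are this doc's values, and tool_result_block is a hypothetical helper.

```python
# Sketch of a single converse call with tool definitions attached.
def converse_once(messages, tool_specs, model_id="us.anthropic.claude-sonnet-4-6"):
    import boto3                                # lazy import; needs AWS credentials
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    return client.converse(
        modelId=model_id,
        messages=messages,                      # alternating user/assistant turns
        toolConfig={"tools": tool_specs},       # [{"toolSpec": {...}}, ...]
        inferenceConfig={"temperature": 0.1, "maxTokens": 4096},
    )

def tool_result_block(tool_use_id: str, payload: dict) -> dict:
    # Tool results go back to the model as a user turn with a toolResult block.
    return {"role": "user", "content": [
        {"toolResult": {"toolUseId": tool_use_id, "content": [{"json": payload}]}}
    ]}

blk = tool_result_block("tooluse_abc123", {"health": "HEALTH_WARN"})
```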

Credentials are provided via the aws-bedrock-credentials Kubernetes secret:

env:
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: aws-bedrock-credentials
        key: aws-access-key-id
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        name: aws-bedrock-credentials
        key: aws-secret-access-key

Sovereign Inference

For IL6 and tactical edge deployments where AWS Bedrock is unavailable, the dev preset routes to a local Ollama instance running Qwen 35B. The agentic loop works the same way — only the LLM endpoint changes. See Sovereign Inference for details on IL2-IL6 inference backends.

System Prompt

The unified chat includes a system prompt that instructs the LLM to:

  1. Always use tools — never suggest CLI commands or code blocks for the user to run
  2. Select the right tool — a categorized guide maps query types to specific tools
  3. Use Claude Code dispatch for code/deploy tasks — don’t try to generate code inline
  4. Confirm destructive operations — delete, restart, drain require operator confirmation
  5. Use real data — never fabricate infrastructure state

The system prompt is ~30 lines and focuses on tool selection guidance rather than personality.

Deployment

Components

Component                               Image                                             Deployment             Namespace
Trailboss API (includes unified chat)   harbor.vitro.lan/ffp/trailboss:v5.5.0             frontier-cluster-api   f3iai
OutpostAI Frontend                      harbor.vitro.lan/ffp/outpostai-frontend:v5.1.12   outpostai-frontend     f3iai

Environment Variables

Variable                Default                                   Description
LLM_PRESET              bedrock-sonnet                            Active LLM preset
AWS_REGION              us-east-1                                 AWS region for Bedrock
AWS_ACCESS_KEY_ID       (from secret)                             Bedrock credentials
AWS_SECRET_ACCESS_KEY   (from secret)                             Bedrock credentials
FFO_MCP_URL             http://ffo-mcp-server.f3iai:50060         FFO MCP server URL
KEYCLOAK_MCP_URL        http://keycloak-mcp-server.f3iai:50057    Keycloak MCP URL
OPENSTACK_MCP_URL       http://openstack-mcp-server.f3iai:8080    OpenStack MCP URL
(etc. for each MCP server)

Build and Deploy

# On build server (texas-dell-04):
cd ~/federal-frontier-platform
git pull
docker build -t harbor.vitro.lan/ffp/trailboss:v5.5.0 .
docker push harbor.vitro.lan/ffp/trailboss:v5.5.0

# Update deployment:
kubectl -n f3iai set image deployment/frontier-cluster-api \
  frontier-cluster-api=harbor.vitro.lan/ffp/trailboss:v5.5.0

File Structure

common/api/
  unified_chat.py       # Bedrock client, MCP tools, agentic loop, Claude Code dispatch
  trailboss_api.py      # FastAPI app — /api/v1/chat delegates to unified_chat.chat()

frontend/src/components/
  ChatBot.vue            # Shared chat UI component (model selector, message history, send)

The unified_chat.py module is ~600 lines and has no dependencies beyond httpx, boto3, and pydantic — all included in the Trailboss image.

Validated April 3, 2026

The unified chat was validated with real queries against live infrastructure:

  • “What MCP tools are available?” — returned formatted tables of all 150+ tools across 11 categories
  • “Hello, what can you do?” — returned capability overview with tool-backed examples
  • Model switching — runtime switch from Sonnet to Opus and back via UI dropdown
  • MCP tool calling — agentic loop successfully queried Ceph, OpenStack, Keycloak, and Kubernetes
  • Claude Code dispatch — tool definition present and callable by the LLM
  • End-to-end — OutpostAI at outpostai.vitro.lan serving Bedrock-powered responses through frontier-cluster-api and unified_chat.chat()