F3Iai Agent Integration with Vitro

How the Federal Frontier AI Platform manages Vitro OpenStack infrastructure through MCP tools, Scout/Wrangler agents, FFO knowledge graph sync, and air-gapped inference.

The Federal Frontier AI Platform (F3Iai) treats Vitro as a managed infrastructure layer. AI agents query, investigate, and remediate OpenStack resources using MCP tool servers. The FFO knowledge graph maintains a live digital twin of the Vitro environment. In air-gapped deployments, inference runs on-premises via vLLM — no external API is reachable.

Kolla OpenStack MCP Server

The Kolla MCP Server provides 10 tools for inspecting and managing OpenStack services running as Docker containers across Vitro hypervisors. It uses SSH-based container inspection via paramiko for real-time health monitoring, log retrieval, and service management.

| Tool | Description |
| --- | --- |
| kolla_list_containers | List all Kolla service containers across hypervisors |
| kolla_inspect_container | Inspect container health, CPU, memory usage |
| kolla_container_logs | Retrieve container logs with optional grep filtering |
| kolla_container_health | Fleet-wide health overview across all hypervisors |
| kolla_restart_container | Restart a specific service container |
| kolla_exec | Run diagnostic commands inside a container |
| kolla_list_services | Map Kolla services to their containers |
| kolla_service_status | Service status across all hosts (up/down/healthy) |
| kolla_check_config | Inspect service configuration files |
| kolla_get_service_logs | Retrieve logs by service name (resolves to correct container) |

The Kolla MCP Server runs as a Kubernetes deployment in the f3iai namespace and connects to hypervisors via SSH. Hypervisor credentials are stored in a Kubernetes secret.
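The SSH-based inspection path can be sketched roughly as follows. This is a minimal illustration, not the server's actual implementation: the SSH user, key path, and helper names are assumptions; only the paramiko-over-SSH mechanism and the docker inspect data come from the description above.

```python
import json

def parse_inspect_output(raw: str) -> dict:
    """Reduce `docker inspect` JSON to the fields a health tool would report."""
    info = json.loads(raw)[0]
    state = info["State"]
    return {
        "name": info["Name"].lstrip("/"),  # docker prefixes names with "/"
        "status": state["Status"],
        "healthy": state.get("Health", {}).get("Status") == "healthy",
        "restarts": info.get("RestartCount", 0),
    }

def inspect_container(host: str, container: str) -> dict:
    """SSH to a hypervisor and inspect one Kolla container (illustrative)."""
    import paramiko  # lazy import: only needed for the live SSH path

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    # Credentials would come from the Kubernetes secret mentioned above;
    # the username and key path here are placeholders.
    client.connect(host, username="kolla", key_filename="/etc/f3iai/ssh/id_ed25519")
    try:
        _, stdout, _ = client.exec_command(f"docker inspect {container}")
        return parse_inspect_output(stdout.read().decode())
    finally:
        client.close()
```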

OpenStack MCP Server

In addition to the Kolla-level tools, the OpenStack MCP Server provides 31 tools for managing OpenStack resources through the OpenStack API (not at the container level):

  • Compute: list/create/delete/reboot/stop/start VMs, get faults, event history
  • Networking: list networks, subnets, routers, floating IPs, security groups, ports
  • Storage: list volumes, volume types, get capacity
  • Images: list images, get image details
  • Identity: list projects, get quotas
  • Infrastructure: hypervisor stats, availability zones, infrastructure summary

The OpenStack MCP Server authenticates to Keystone using service credentials and operates at the API level — it does not SSH into hypervisors.
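A hedged sketch of what API-level access looks like, here using openstacksdk (the document does not name the server's actual client library); the Keystone URL, service account, and project names are placeholders for values that would come from the service credentials.

```python
def summarize_servers(servers) -> dict:
    """Count servers by status, e.g. {'ACTIVE': 12, 'ERROR': 1}."""
    counts: dict = {}
    for s in servers:
        counts[s.status] = counts.get(s.status, 0) + 1
    return counts

def list_vm_summary() -> dict:
    """Authenticate to Keystone and summarize VM state fleet-wide (illustrative)."""
    import openstack  # lazy import: requires openstacksdk

    conn = openstack.connect(
        auth_url="https://keystone.vitro.local:5000/v3",  # placeholder
        username="f3iai-svc",                             # placeholder
        password="<from-secret>",
        project_name="admin",
        user_domain_name="Default",
        project_domain_name="Default",
    )
    # Pure API-level call: no SSH to hypervisors is involved.
    return summarize_servers(conn.compute.servers(all_projects=True))
```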

How Agents Use Vitro Tools

Scout — Discovery and Context

Scout agents poll Vitro infrastructure and write current state to the FFO knowledge graph. The HyperSync CronJobs run every 15 minutes:

| Sync Job | Source | What It Writes to FFO |
| --- | --- | --- |
| ceph-sync | Ceph MCP Server | Pools, OSDs, monitors, capacity, health |
| openstack-sync | OpenStack MCP Server | VMs, networks, images, flavors |
| capi-sync | Kubernetes API | CAPO clusters, machines, nodes, deployments |
| keycloak-sync | Keycloak MCP Server | Principals, roles, role assignments |
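The sync jobs above run as ordinary Kubernetes CronJobs. A minimal sketch of what one might look like, where the image name, namespace, and arguments are assumptions and only the 15-minute cadence comes from the text:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: openstack-sync
  namespace: f3iai
spec:
  schedule: "*/15 * * * *"   # every 15 minutes, per the HyperSync cadence
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: openstack-sync
              image: registry.local/f3iai/hypersync:latest  # illustrative
              args: ["--source", "openstack-mcp", "--target", "ffo"]
```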

When an alert fires, the Dispatch Controller queries FFO at dispatch time to inject live context into the agent’s prompt. The agent knows the resource’s current state, relationships, and history before taking a single action.
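The dispatch-time injection step can be illustrated with a small sketch. The alert and FFO field names here are hypothetical; only the flow (query FFO, fold the result into the agent's prompt) comes from the text.

```python
def build_agent_prompt(alert: dict, ffo_context: dict) -> str:
    """Fold live FFO context into the prompt an agent receives at dispatch."""
    lines = [
        f"Alert: {alert['name']} on {alert['resource']}",
        f"Current state: {ffo_context['state']}",
        "Related entities:",
    ]
    # Each relationship line gives the agent graph context it would otherwise
    # have to discover by querying each system individually.
    lines += [f"  - {rel}" for rel in ffo_context["relationships"]]
    return "\n".join(lines)
```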

Wrangler — Investigation and Remediation

When a Vitro-related alert is dispatched, Wrangler uses the Kolla and OpenStack MCP tools to investigate:

  1. Container-level: Query kolla_container_health for fleet-wide status, kolla_inspect_container for specific service health, kolla_get_service_logs for error analysis
  2. API-level: Query openstack_list_vms for VM status, openstack_get_server_faults for error details, openstack_get_hypervisor_stats for capacity
  3. Storage-level: Query ceph_get_cluster_health, ceph_list_osds for storage health, ceph_get_usage_by_pool for capacity

For LOW risk events (disk cleanup, log rotation, non-production pod restarts), Wrangler acts directly — restarting containers via kolla_restart_container, vacuuming journals, or expanding volumes. For HIGH risk events (Ceph OSD failures, Neutron outages, Keystone issues), Wrangler investigates and documents findings without taking action.
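The LOW/HIGH gate described above amounts to a risk check before any action. A minimal sketch, where the event categories come from the text but the classification helper itself is an assumption:

```python
# Event categories taken from the examples in the text.
LOW_RISK = {"disk_cleanup", "log_rotation", "nonprod_pod_restart"}
HIGH_RISK = {"ceph_osd_failure", "neutron_outage", "keystone_issue"}

def remediation_mode(event_type: str) -> str:
    """Return 'act' for LOW risk events, 'investigate-only' for HIGH risk."""
    if event_type in LOW_RISK:
        return "act"
    # HIGH risk and unknown events both take the safe path: Wrangler
    # investigates and documents findings without acting.
    return "investigate-only"
```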

FFO Digital Twin Sync

The FFO TypeDB knowledge graph maintains a continuously updated model of the Vitro environment. This is not a static CMDB — it is a live digital twin that reflects current infrastructure state.

What FFO stores for Vitro:

| Entity Type | Source | Examples |
| --- | --- | --- |
| cluster | CAPO/Kubernetes | Workload clusters provisioned on OpenStack |
| node | Kubernetes API | Worker and control plane nodes |
| volume | Ceph/OpenStack | Cinder volumes, RBD images |
| network | Neutron | Provider networks, tenant networks, subnets |
| principal | Keycloak | Users, roles, group memberships |
| finding | Trivy/compliance | CVEs, STIG violations, CIS benchmark failures |
| deployment | ArgoCD | GitOps-managed workloads |
| image | Glance/Harbor | VM images, container images |

What FFO stores as relationships:

  • This VM runs on this hypervisor
  • This Ceph pool backs this Kubernetes StorageClass
  • This PVC is bound to this Ceph RBD image
  • This user has this role in this realm
  • This cluster has these open findings

Agents read these relationships at dispatch time. When a PVC alert fires, the agent knows which Ceph pool backs the storage, which OSDs serve that pool, and whether the pool has had capacity issues before — without querying each system individually.
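A toy traversal makes the point concrete. The in-memory triple list below stands in for TypeDB, and the entity identifiers are hypothetical; the relationship shapes mirror the examples above.

```python
# Hypothetical relationship triples, standing in for the FFO graph.
RELATIONS = [
    ("pvc/data-0", "bound_to", "rbd/vol-7f3a"),
    ("rbd/vol-7f3a", "in_pool", "pool/k8s-rbd"),
    ("pool/k8s-rbd", "served_by", "osd/12"),
    ("pool/k8s-rbd", "served_by", "osd/31"),
]

def neighbors(entity: str) -> list:
    return [(rel, dst) for src, rel, dst in RELATIONS if src == entity]

def storage_path(pvc: str) -> list:
    """Walk PVC -> RBD image -> Ceph pool -> OSDs in one traversal."""
    path, frontier = [], [pvc]
    while frontier:
        current = frontier.pop()
        for rel, dst in neighbors(current):
            path.append((current, rel, dst))
            frontier.append(dst)
    return path
```

One graph walk answers what would otherwise take separate Kubernetes, Cinder, and Ceph queries.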

Air-Gapped Inference

Production Vitro deployments operate in air-gapped environments where no external API is reachable, including Anthropic models on AWS Bedrock. AI agents therefore use on-premises inference:

| Classification | Inference Backend | Model |
| --- | --- | --- |
| IL2-IL4 (connected) | AWS Bedrock VPC PrivateLink | Claude Sonnet/Opus 4.6 |
| IL5 (GovCloud) | AWS Bedrock GovCloud | Claude Sonnet/Opus 4.6 |
| IL6 (air-gapped) | vLLM on VitroAI bare metal | Llama 3.1 70B Instruct |
| Tactical edge | Ollama on Ampere ARM64 | Llama 3.1 8B |

The Agent Harness is model-agnostic. The same harness governs agent behavior regardless of which inference backend is in use. The k8s-vitroai/ deployment overlay patches the dispatch controller to use the local vLLM endpoint instead of Bedrock.
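The classification-to-backend mapping in the table above can be expressed as a small dispatch helper. This is a sketch: the endpoint URL and model identifiers are illustrative, and the real selection happens via the k8s-vitroai/ deployment overlay rather than application code.

```python
def inference_config(classification: str) -> dict:
    """Map an impact level to its inference backend, per the table above."""
    if classification in ("IL2", "IL3", "IL4"):
        return {"backend": "bedrock", "model": "claude-sonnet"}
    if classification == "IL5":
        return {"backend": "bedrock-govcloud", "model": "claude-sonnet"}
    if classification == "IL6":
        # vLLM exposes an OpenAI-compatible API, so the harness only needs
        # a different base URL; endpoint shown here is an assumption.
        return {"backend": "vllm",
                "endpoint": "http://vllm.f3iai.svc:8000/v1",
                "model": "llama-3.1-70b-instruct"}
    # Tactical edge and anything unclassified fall back to local Ollama.
    return {"backend": "ollama", "model": "llama-3.1-8b"}
```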

In air-gapped mode:

  • All MCP servers run inside the air gap
  • FFO TypeDB runs inside the air gap
  • vLLM inference runs on local GPU hardware
  • No data leaves the authorization boundary
  • The agent has the same capabilities — only the LLM changes