AI Chat Interface
Natural language infrastructure queries via query templates and LLM-powered tool calling with 150+ MCP tools.
The Compass chat interface allows platform operators to query infrastructure state using natural language. It supports two processing paths: fast pattern-matched query templates for common questions, and an LLM-backed path for everything else.
Query Templates
Query templates are pattern-matched against the user’s input before the LLM is invoked. When a template matches, Compass executes the query directly and returns formatted results immediately — no LLM round-trip required. This makes common queries fast and deterministic.
Supported Templates
| Pattern | Description | Example |
|---|---|---|
| `list clusters` | All Kubernetes clusters from FFO | “list clusters” |
| `ceph health` | Ceph cluster health summary | “ceph health” |
| `list users` | Keycloak user listing | “show me all users” |
| `mcp servers` | Registered MCP servers and their status | “mcp servers” |
| `kolla services` | OpenStack Kolla service inventory | “list kolla services” |
| `list nodes` | Cluster node inventory | “list nodes” |
| `list deployments` | Deployment inventory from FFO | “show deployments” |
Template matching is case-insensitive and supports common variations (e.g., “show me clusters” matches the list clusters template). Results are returned as formatted markdown tables.
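The matching step can be sketched as a small ordered table of regexes checked before any LLM call. This is an illustrative sketch only; the patterns and handler names below are assumptions, not Compass's actual implementation:

```python
import re
from typing import Optional

# Illustrative pattern table: each template maps a case-insensitive regex
# (covering common phrasing variations) to a handler name.
QUERY_TEMPLATES: list[tuple[re.Pattern, str]] = [
    (re.compile(r"\b(list|show( me)?( all)?)\s+clusters\b", re.I), "list_clusters"),
    (re.compile(r"\bceph\s+health\b", re.I), "ceph_health"),
    (re.compile(r"\b(list|show( me)?( all)?)\s+users\b", re.I), "list_users"),
    (re.compile(r"\bmcp\s+servers\b", re.I), "mcp_servers"),
]

def match_template(user_input: str) -> Optional[str]:
    """Return the handler name of the first matching template,
    or None to fall through to the LLM path."""
    for pattern, handler in QUERY_TEMPLATES:
        if pattern.search(user_input):
            return handler
    return None
```

With this shape, “show me clusters” and “list clusters” both resolve to the same handler, and anything unmatched falls through to the LLM path described next.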
LLM Path
When no query template matches, the input is forwarded to the LLM for processing.
Model and Endpoint
Compass uses Ollama as the LLM backend, calling the `/api/chat` endpoint with native tool-calling support.
- Model: `qwen3.5:35b`
- Temperature: `0.1` (low temperature for deterministic, factual responses)
- Timeout: 120 seconds per request
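Put together, a request to Ollama's `/api/chat` looks roughly like the sketch below. The endpoint value is an assumed placeholder for the `LLM_ENDPOINT` setting, and the client code is illustrative, not Compass's actual implementation:

```python
import json
import urllib.request

LLM_ENDPOINT = "http://ollama:11434"  # assumption: placeholder for the LLM_ENDPOINT setting

def build_chat_request(messages: list[dict], tools: list[dict]) -> tuple[str, dict]:
    """Build the Ollama /api/chat request used on the LLM path."""
    return (
        f"{LLM_ENDPOINT}/api/chat",
        {
            "model": "qwen3.5:35b",
            "messages": messages,
            "tools": tools,  # tool definitions in Ollama's native format
            "stream": False,
            "options": {"temperature": 0.1},  # deterministic, factual responses
        },
    )

def chat(messages: list[dict], tools: list[dict]) -> dict:
    url, payload = build_chat_request(messages, tools)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:  # 120 s per request
        # The returned message carries "content", plus "tool_calls" when tools fire.
        return json.load(resp)["message"]
```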
System Prompt
The system prompt is intentionally minimal — one to two lines maximum. The qwen3.5:35b model produces degraded results with longer system prompts. The prompt identifies the assistant’s role and instructs it to use available tools to answer questions about infrastructure.
Tool Selection
With 150+ tools available across 12 MCP servers, Compass uses a core_prefixes filter to narrow the tool set sent to the LLM on each request. This prevents the model from being overwhelmed by too many tool definitions and keeps the context window manageable.
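A prefix filter of this kind can be sketched as below. The prefix values are assumptions for illustration; Compass's actual `core_prefixes` configuration may differ:

```python
# Assumed prefix list for illustration only.
CORE_PREFIXES = ("cluster_", "ceph_", "keycloak_", "kolla_")

def filter_tools(all_tools: list[dict]) -> list[dict]:
    """Keep only tools whose names start with a core prefix, so the
    model sees a manageable subset of the 150+ registered tools."""
    return [
        t for t in all_tools
        if t["function"]["name"].startswith(CORE_PREFIXES)
    ]
```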
The LLM receives tool definitions in Ollama’s native format and selects which tools to call based on the user’s question. The API executes the selected tool calls against the appropriate MCP servers and returns results to the LLM for synthesis.
Response Processing
Tool call results are processed through the _format_mcp_result() function, which converts raw MCP responses into readable markdown tables. The LLM then synthesizes the formatted data into a natural language response.
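The conversion step can be sketched as follows. This is a hypothetical rendering of `_format_mcp_result()` under the assumption that results arrive as a list of records; the real function will also handle non-tabular and error payloads:

```python
def format_mcp_result(rows: list[dict]) -> str:
    """Hypothetical sketch: render a list of MCP result records
    as a markdown table."""
    if not rows:
        return "_No results._"
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "---|" * len(headers),
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row.get(h, "")) for h in headers) + " |")
    return "\n".join(lines)
```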
Conversation Flow
```mermaid
flowchart TD
    Input["User Query"] --> Match{"Template Match?"}
    Match -->|yes| Direct["Execute Template Query"] --> Result1["Formatted Response"]
    Match -->|no| LLM["LLM qwen3.5:35b"]
    LLM -->|tool_calls| MCP["Execute MCP Tool Calls"]
    MCP -->|results| Synth["LLM Synthesizes Final Answer"] --> Result2["Formatted Response"]
    style Input fill:#2b6cb0,stroke:#4299e1,color:#fff
    style Match fill:#553c9a,stroke:#805ad5,color:#e2e8f0
    style Direct fill:#2c7a7b,stroke:#38b2ac,color:#e2e8f0
    style LLM fill:#553c9a,stroke:#805ad5,color:#e2e8f0
    style MCP fill:#2d3748,stroke:#4299e1,color:#e2e8f0
    style Synth fill:#553c9a,stroke:#805ad5,color:#e2e8f0
    style Result1 fill:#276749,stroke:#48bb78,color:#e2e8f0
    style Result2 fill:#276749,stroke:#48bb78,color:#e2e8f0
```
UI Features
Copy Button
Each assistant message includes a copy button that copies the response content to the clipboard. This is useful for extracting table data, entity identifiers, or command output from chat responses.
Message History
The chat maintains conversation history within a session, allowing follow-up questions that reference previous context (e.g., “tell me more about that cluster” after a listing query).
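Session history of this kind can be sketched as an ordered message list that is resent in full on each LLM request. This is a minimal illustrative sketch, not Compass's actual session code:

```python
class ChatSession:
    """Minimal sketch of per-session history: each turn is appended so
    follow-up questions carry previous context to the LLM."""

    def __init__(self, system_prompt: str):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, text: str) -> list[dict]:
        self.messages.append({"role": "user", "content": text})
        return self.messages  # full history is sent with each LLM request

    def add_assistant(self, text: str) -> None:
        self.messages.append({"role": "assistant", "content": text})
```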
Troubleshooting
| Symptom | Likely Cause | Resolution |
|---|---|---|
| Chat hangs for 120 s then errors | Ollama endpoint unreachable or model not loaded | Verify `LLM_ENDPOINT` is correct; check Ollama pod logs |
| Template query returns empty results | Upstream MCP server or TypeDB unavailable | Check `/ready` endpoint; verify TypeDB tunnel |
| LLM ignores available tools | Too many tools in context or system prompt too long | Verify `core_prefixes` filter is active; keep system prompt under 2 lines |
| Garbled or incoherent responses | Model context overflow | Reduce conversation history length; start a new chat session |