AI Chat Interface
Natural language infrastructure queries via query templates and LLM-powered tool calling with 150+ MCP tools.
The Compass chat interface allows platform operators to query infrastructure state using natural language. It supports two processing paths: fast pattern-matched query templates for common questions, and an LLM-backed path for everything else.
Query Templates
Query templates are pattern-matched against the user’s input before the LLM is invoked. When a template matches, Compass executes the query directly and returns formatted results immediately — no LLM round-trip required. This makes common queries fast and deterministic.
Supported Templates
| Pattern | Description | Example |
|---|---|---|
| `list clusters` | All Kubernetes clusters from FFO | “list clusters” |
| `ceph health` | Ceph cluster health summary | “ceph health” |
| `list users` | Keycloak user listing | “show me all users” |
| `mcp servers` | Registered MCP servers and their status | “mcp servers” |
| `kolla services` | OpenStack Kolla service inventory | “list kolla services” |
| `list nodes` | Cluster node inventory | “list nodes” |
| `list deployments` | Deployment inventory from FFO | “show deployments” |
Template matching is case-insensitive and supports common variations (e.g., “show me clusters” matches the list clusters template). Results are returned as formatted markdown tables.
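The matching step can be sketched as a small ordered table of regexes checked before any LLM call. This is an illustrative sketch only; the patterns and handler names below are assumptions, not Compass's actual implementation:

```python
import re
from typing import Optional

# Illustrative pattern table: each template maps a case-insensitive regex
# (covering common phrasing variations) to a handler name.
QUERY_TEMPLATES: list[tuple[re.Pattern, str]] = [
    (re.compile(r"\b(list|show( me)?( all)?)\s+clusters\b", re.I), "list_clusters"),
    (re.compile(r"\bceph\s+health\b", re.I), "ceph_health"),
    (re.compile(r"\b(list|show( me)?( all)?)\s+users\b", re.I), "list_users"),
    (re.compile(r"\bmcp\s+servers\b", re.I), "mcp_servers"),
]

def match_template(user_input: str) -> Optional[str]:
    """Return the handler name of the first matching template,
    or None to fall through to the LLM path."""
    for pattern, handler in QUERY_TEMPLATES:
        if pattern.search(user_input):
            return handler
    return None
```

With this shape, “show me clusters” and “list clusters” both resolve to the same handler, and anything unmatched falls through to the LLM path described next.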
LLM Path
When no query template matches, the input is forwarded to the LLM for processing.
Model and Endpoint
Compass uses Ollama as the LLM backend, calling the `/api/chat` endpoint with native tool-calling support.
- Model: `qwen3.5:35b`
- Temperature: `0.1` (low temperature for deterministic, factual responses)
- Timeout: 120 seconds per request
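Put together, a request to Ollama's `/api/chat` looks roughly like the sketch below. The endpoint value is an assumed placeholder for the `LLM_ENDPOINT` setting, and the client code is illustrative, not Compass's actual implementation:

```python
import json
import urllib.request

LLM_ENDPOINT = "http://ollama:11434"  # assumption: placeholder for the LLM_ENDPOINT setting

def build_chat_request(messages: list[dict], tools: list[dict]) -> tuple[str, dict]:
    """Build the Ollama /api/chat request used on the LLM path."""
    return (
        f"{LLM_ENDPOINT}/api/chat",
        {
            "model": "qwen3.5:35b",
            "messages": messages,
            "tools": tools,  # tool definitions in Ollama's native format
            "stream": False,
            "options": {"temperature": 0.1},  # deterministic, factual responses
        },
    )

def chat(messages: list[dict], tools: list[dict]) -> dict:
    url, payload = build_chat_request(messages, tools)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:  # 120 s per request
        # The returned message carries "content", plus "tool_calls" when tools fire.
        return json.load(resp)["message"]
```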
System Prompt
The system prompt is intentionally minimal — one to two lines maximum. The qwen3.5:35b model produces degraded results with longer system prompts. The prompt identifies the assistant’s role and instructs it to use available tools to answer questions about infrastructure.
Tool Selection
With 150+ tools available across 12 MCP servers, Compass uses a core_prefixes filter to narrow the tool set sent to the LLM on each request. This prevents the model from being overwhelmed by too many tool definitions and keeps the context window manageable.
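A prefix filter of this kind can be sketched as below. The prefix values are assumptions for illustration; Compass's actual `core_prefixes` configuration may differ:

```python
# Assumed prefix list for illustration only.
CORE_PREFIXES = ("cluster_", "ceph_", "keycloak_", "kolla_")

def filter_tools(all_tools: list[dict]) -> list[dict]:
    """Keep only tools whose names start with a core prefix, so the
    model sees a manageable subset of the 150+ registered tools."""
    return [
        t for t in all_tools
        if t["function"]["name"].startswith(CORE_PREFIXES)
    ]
```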
The LLM receives tool definitions in Ollama’s native format and selects which tools to call based on the user’s question. The API executes the selected tool calls against the appropriate MCP servers and returns results to the LLM for synthesis.
Response Processing
Tool call results are processed through the _format_mcp_result() function, which converts raw MCP responses into readable markdown tables. The LLM then synthesizes the formatted data into a natural language response.
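The conversion step can be sketched as follows. This is a hypothetical rendering of `_format_mcp_result()` under the assumption that results arrive as a list of records; the real function will also handle non-tabular and error payloads:

```python
def format_mcp_result(rows: list[dict]) -> str:
    """Hypothetical sketch: render a list of MCP result records
    as a markdown table."""
    if not rows:
        return "_No results._"
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "---|" * len(headers),
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row.get(h, "")) for h in headers) + " |")
    return "\n".join(lines)
```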
Conversation Flow
```mermaid
flowchart TD
    Input["User Query"] --> Match{"Template Match?"}
    Match -->|yes| Direct["Execute Template Query"] --> Result1["Formatted Response"]
    Match -->|no| LLM["LLM qwen3.5:35b"]
    LLM -->|tool_calls| MCP["Execute MCP Tool Calls"]
    MCP -->|results| Synth["LLM Synthesizes Final Answer"] --> Result2["Formatted Response"]
    style Input fill:#2b6cb0,stroke:#4299e1,color:#fff
    style Match fill:#553c9a,stroke:#805ad5,color:#e2e8f0
    style Direct fill:#2c7a7b,stroke:#38b2ac,color:#e2e8f0
    style LLM fill:#553c9a,stroke:#805ad5,color:#e2e8f0
    style MCP fill:#2d3748,stroke:#4299e1,color:#e2e8f0
    style Synth fill:#553c9a,stroke:#805ad5,color:#e2e8f0
    style Result1 fill:#276749,stroke:#48bb78,color:#e2e8f0
    style Result2 fill:#276749,stroke:#48bb78,color:#e2e8f0
```
UI Features
Copy Button
Each assistant message includes a copy button that copies the response content to the clipboard. This is useful for extracting table data, entity identifiers, or command output from chat responses.
Message History
The chat maintains conversation history within a session, allowing follow-up questions that reference previous context (e.g., “tell me more about that cluster” after a listing query).
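Session history of this kind can be sketched as an ordered message list that is resent in full on each LLM request. This is a minimal illustrative sketch, not Compass's actual session code:

```python
class ChatSession:
    """Minimal sketch of per-session history: each turn is appended so
    follow-up questions carry previous context to the LLM."""

    def __init__(self, system_prompt: str):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, text: str) -> list[dict]:
        self.messages.append({"role": "user", "content": text})
        return self.messages  # full history is sent with each LLM request

    def add_assistant(self, text: str) -> None:
        self.messages.append({"role": "assistant", "content": text})
```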
Troubleshooting
| Symptom | Likely Cause | Resolution |
|---|---|---|
| Chat hangs for 120 s then errors | Ollama endpoint unreachable or model not loaded | Verify `LLM_ENDPOINT` is correct; check Ollama pod logs |
| Template query returns empty results | Upstream MCP server or TypeDB unavailable | Check `/ready` endpoint; verify TypeDB tunnel |
| LLM ignores available tools | Too many tools in context or system prompt too long | Verify `core_prefixes` filter is active; keep system prompt under 2 lines |
| Garbled or incoherent responses | Model context overflow | Reduce conversation history length; start a new chat session |