Network Configuration for Multi-Service Access
Making Ollama accessible from Kubernetes pods, remote machines, and other services on the LAN, with firewall, DNS, and TLS considerations.
Developer Workstation Setup — This guide describes running LLM inference on an Apple Silicon Mac for local development, operator tooling, and Compass AI chat. This is not the production inference architecture. For production air-gapped deployments, see the vLLM on Kubernetes Production Inference Guide.
Default Behavior
By default, Ollama binds to localhost:11434. This means only processes on the same machine can reach it. For the Federal Frontier Platform, Kubernetes pods running on texas-dell-04 need to call Ollama running on a Mac at <ollama-host>. This requires explicit network configuration.
Binding to All Interfaces
Set OLLAMA_HOST to 0.0.0.0:11434 to accept connections from any network interface:
For ollama serve:
OLLAMA_HOST=0.0.0.0:11434 ollama serve
For the macOS application:
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
# Restart the Ollama application after setting this
For a launchctl plist:
<key>EnvironmentVariables</key>
<dict>
    <key>OLLAMA_HOST</key>
    <string>0.0.0.0:11434</string>
</dict>
Verify from the local machine:
curl http://localhost:11434/api/tags
Verify from a remote machine:
curl http://<ollama-host>:11434/api/tags
If the local curl succeeds but the remote curl fails, the issue is likely the macOS application firewall or a firewall elsewhere on the network path.
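When the remote check fails, curl's exit code narrows down the cause. A minimal triage sketch (the diagnose helper is illustrative, not part of Ollama or curl):

```shell
#!/bin/sh
# Map a curl exit code to the most likely cause (helper name is illustrative).
diagnose() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "could not resolve host: check DNS, or use the IP directly" ;;
    7)  echo "connection refused: check OLLAMA_HOST binding and the macOS firewall" ;;
    28) echo "timed out: a network-level firewall is likely dropping packets" ;;
    *)  echo "curl exit code $1: see the EXIT CODES section of 'man curl'" ;;
  esac
}

# Probe the remote endpoint with a 5-second cap, then interpret the result.
curl -s -m 5 -o /dev/null "http://<ollama-host>:11434/api/tags"
diagnose $?
```

Exit code 7 usually means the port is closed or the service is bound to localhost only; 28 usually means packets are being silently dropped.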
macOS Firewall Configuration
macOS has a built-in application firewall that can block incoming connections even when OLLAMA_HOST=0.0.0.0.
Check Firewall Status
System Settings > Network > Firewall
If the firewall is enabled, Ollama must be explicitly allowed:
- Open System Settings > Network > Firewall > Options.
- Look for Ollama in the application list.
- If present, ensure it is set to Allow incoming connections.
- If not present, click + and add the Ollama application (usually at
/Applications/Ollama.app or the binary at /usr/local/bin/ollama).
Alternative: Allow via Command Line
# Check firewall status
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --getglobalstate
# Add Ollama to allowed applications
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --add /usr/local/bin/ollama
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --unblockapp /usr/local/bin/ollama
Verify Connectivity
From the Kubernetes host or any LAN machine:
# Test TCP connectivity
nc -zv <ollama-host> 11434
# Test API response
curl -s -o /dev/null -w "%{http_code}" http://<ollama-host>:11434/api/tags
# Should return 200
Kubernetes Integration
Direct IP Access
The simplest approach: configure Kubernetes workloads to call Ollama at the Mac’s LAN IP directly.
In the Compass API deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: compass-api
  namespace: f3iai
spec:
  template:
    spec:
      containers:
      - name: compass-api
        env:
        - name: LLM_ENDPOINT
          value: "http://<ollama-host>:11434"
        - name: LLM_MODEL
          value: "qwen3.5:35b-a3b-q4_K_M"
This works when:
- The Kubernetes nodes can reach <ollama-host> on the LAN.
- The Mac's IP is static or reserved via DHCP.
- No network policies block egress to the LAN from the f3iai namespace.
ExternalName Service
Create a Kubernetes Service that acts as a DNS alias for the Mac’s IP. This lets pods reference Ollama as a service name rather than a hard-coded IP.
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: f3iai
spec:
  type: ExternalName
  externalName: "<ollama-host>"
  ports:
  - port: 11434
    targetPort: 11434
    protocol: TCP
Note: ExternalName Services are implemented as DNS CNAME records, so externalName should be a DNS hostname; a raw IP address is not supported and behaves inconsistently across Kubernetes versions and DNS implementations. If the Mac has no resolvable hostname, use a selector-less Service with a manual Endpoints resource instead:
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: f3iai
spec:
  ports:
  - port: 11434
    targetPort: 11434
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  name: ollama
  namespace: f3iai
subsets:
- addresses:
  - ip: "<ollama-host>"
  ports:
  - port: 11434
    protocol: TCP
Pods can then reference http://ollama.f3iai.svc.cluster.local:11434 or simply http://ollama:11434 from within the f3iai namespace.
Network Policies
If the f3iai namespace has NetworkPolicy resources that restrict egress, you need to allow traffic to the Mac’s IP:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ollama-egress
  namespace: f3iai
spec:
  podSelector:
    matchLabels:
      app: compass-api
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: <ollama-host>/32
    ports:
    - protocol: TCP
      port: 11434
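Note that once a pod is selected by an egress NetworkPolicy, all egress not explicitly allowed is denied, including cluster DNS lookups. If name resolution breaks after applying the policy above, append a rule like the following to its egress list (a sketch; the kubernetes.io/metadata.name label is set automatically on Kubernetes v1.22+, and the kube-system namespace is the common default location for DNS pods — verify it matches your cluster):

```yaml
# Additional egress rule for the same NetworkPolicy: allow cluster DNS.
- to:
  - namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: kube-system
  ports:
  - protocol: UDP
    port: 53
  - protocol: TCP
    port: 53
```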
DNS Considerations
If the Mac’s IP changes, every reference to <ollama-host> breaks. Mitigation options:
- DHCP reservation: Configure your router to always assign <ollama-host> to the Mac's MAC address. This is the recommended approach.
- Static IP on the Mac: System Settings > Network > Wi-Fi/Ethernet > Details > TCP/IP > Configure IPv4: Manually. Set the IP, subnet mask, router, and DNS server.
- Local DNS entry: If you run a local DNS server (e.g., CoreDNS, Pi-hole, or your router's DNS), create an A record:
ollama.local.lan A <ollama-host>
Then reference http://ollama.local.lan:11434 in all configurations.
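Whichever option you choose, verify that the name resolves to the expected address before updating any configuration. A small sketch (check_dns is an illustrative helper; dig ships with macOS):

```shell
#!/bin/sh
# Compare a resolved A record against the expected address (helper is illustrative).
check_dns() {
  if [ "$1" = "$2" ]; then
    echo "DNS OK"
  else
    echo "DNS mismatch: expected '$2', got '$1'"
  fi
}

EXPECTED="<ollama-host>"                          # the Mac's reserved IP
RESOLVED=$(dig +short ollama.local.lan A | head -n1)
check_dns "$RESOLVED" "$EXPECTED"
```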
TLS Configuration
Ollama does not support TLS natively. All traffic is plain HTTP. For LAN-only traffic between the Kubernetes cluster and the Mac, this is typically acceptable. If you require TLS:
Caddy Reverse Proxy
Install Caddy on the Mac and configure it as a TLS-terminating reverse proxy:
brew install caddy
Create a Caddyfile:
ollama.local.lan:11435 {
    reverse_proxy localhost:11434
    tls internal
}
This generates a self-signed certificate. For trusted certificates, use a proper domain and ACME provider.
Start Caddy:
caddy run --config /path/to/Caddyfile
Clients then connect to https://ollama.local.lan:11435. Note that Kubernetes pods will need to trust Caddy's internal CA certificate unless the client is configured to skip TLS certificate verification, which defeats the purpose of adding TLS.
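To trust the internal CA from the cluster, export Caddy's root certificate from the Mac (with tls internal it is typically stored under Caddy's data directory, on macOS usually "$HOME/Library/Application Support/Caddy/pki/authorities/local/root.crt" — verify the path in your install), publish it as a ConfigMap named caddy-local-ca with key ca.crt, and mount it into the client pod. A sketch of the Deployment fragment, assuming that ConfigMap exists:

```yaml
# Fragment for the compass-api Deployment: mount the exported Caddy root CA
# and point the runtime at it via SSL_CERT_FILE (honored by many, but not
# all, HTTP client stacks; Python's requests uses REQUESTS_CA_BUNDLE instead).
spec:
  template:
    spec:
      containers:
      - name: compass-api
        env:
        - name: SSL_CERT_FILE
          value: /etc/ssl/caddy/ca.crt
        volumeMounts:
        - name: caddy-ca
          mountPath: /etc/ssl/caddy
          readOnly: true
      volumes:
      - name: caddy-ca
        configMap:
          name: caddy-local-ca
```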
nginx Reverse Proxy
server {
    listen 11435 ssl;
    server_name ollama.local.lan;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://127.0.0.1:11434;
        proxy_set_header Host $host;
        proxy_read_timeout 120s;  # Tool calling can be slow
        proxy_send_timeout 120s;
    }
}
The proxy_read_timeout of 120 seconds is important — Ollama can take 60-120 seconds to process requests with 150+ tools in the payload.
Monitoring Connectivity
From the Mac (server side)
# Check Ollama is running and what models are loaded
ollama ps
# Check what port Ollama is listening on
lsof -i :11434
# Watch Ollama logs for incoming requests (the path depends on how Ollama
# was launched; the macOS app writes to ~/.ollama/logs/server.log)
tail -f /tmp/ollama.err.log
From a Kubernetes pod (client side)
# Exec into a pod in the f3iai namespace
kubectl exec -it -n f3iai deploy/compass-api -- sh
# Test connectivity
curl -s http://<ollama-host>:11434/api/tags
# Test inference
curl -s http://<ollama-host>:11434/api/generate \
-d '{"model":"qwen3.5:35b-a3b-q4_K_M","prompt":"hello","stream":false}'
Automated Health Check
A simple script to verify Ollama is accessible and a model is loaded:
#!/bin/bash
OLLAMA_HOST="<ollama-host>:11434"
# Check API is reachable
if ! curl -sf "http://${OLLAMA_HOST}/api/tags" > /dev/null 2>&1; then
echo "CRITICAL: Ollama API unreachable at ${OLLAMA_HOST}"
exit 2
fi
# Check if the production model is available
MODEL="qwen3.5:35b-a3b-q4_K_M"
if ! curl -sf "http://${OLLAMA_HOST}/api/tags" | python3 -c "
import sys, json
models = json.load(sys.stdin)['models']
names = [m['name'] for m in models]
sys.exit(0 if '${MODEL}' in names else 1)
" 2>/dev/null; then
echo "WARNING: Model ${MODEL} not found on ${OLLAMA_HOST}"
exit 1
fi
echo "OK: Ollama healthy, ${MODEL} available"
exit 0
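To run this on a schedule, a crontab entry on any machine that can reach the Mac is enough (the install path and log path below are hypothetical placeholders):

```
# Run the health check every 5 minutes, appending results to a log
*/5 * * * * /usr/local/bin/ollama-healthcheck.sh >> /var/log/ollama-health.log 2>&1
```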