Network Configuration for Multi-Service Access

Making Ollama accessible from Kubernetes pods, remote machines, and other services on the LAN, with firewall, DNS, and TLS considerations.

Developer Workstation Setup: This guide describes running LLM inference on an Apple Silicon Mac for local development, operator tooling, and Compass AI chat. It is not the production inference architecture; for production air-gapped deployments, see the vLLM on Kubernetes Production Inference Guide.

Default Behavior

By default, Ollama binds to localhost:11434. This means only processes on the same machine can reach it. For the Federal Frontier Platform, Kubernetes pods running on texas-dell-04 need to call Ollama running on a Mac at <ollama-host>. This requires explicit network configuration.

Binding to All Interfaces

Set OLLAMA_HOST to 0.0.0.0:11434 to accept connections from any network interface. Be aware that Ollama has no built-in authentication: anything that can reach the port can use the API, so keep exposure limited to the trusted LAN with the firewall rules below.

For ollama serve:

OLLAMA_HOST=0.0.0.0:11434 ollama serve

For the macOS application:

launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
# Restart the Ollama application after setting this

For a launchd plist:

<key>EnvironmentVariables</key>
<dict>
    <key>OLLAMA_HOST</key>
    <string>0.0.0.0:11434</string>
</dict>
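For context, that fragment sits inside a complete LaunchAgent. A full plist might look like the following sketch; the label and log path are placeholders to adapt:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.ollama</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/ollama</string>
        <string>serve</string>
    </array>
    <key>EnvironmentVariables</key>
    <dict>
        <key>OLLAMA_HOST</key>
        <string>0.0.0.0:11434</string>
    </dict>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
    <key>StandardErrorPath</key>
    <string>/tmp/ollama.err.log</string>
</dict>
</plist>
```

Save it as ~/Library/LaunchAgents/com.example.ollama.plist and load it with launchctl load (or launchctl bootstrap gui/$(id -u) on newer macOS).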

Verify from the local machine:

curl http://localhost:11434/api/tags

Verify from a remote machine:

curl http://<ollama-host>:11434/api/tags

If the local curl succeeds but the remote curl fails, the issue is the macOS firewall or a network-level firewall.

macOS Firewall Configuration

macOS has a built-in application firewall that can block incoming connections even when OLLAMA_HOST=0.0.0.0.

Check Firewall Status

System Settings > Network > Firewall

If the firewall is enabled, Ollama must be explicitly allowed:

  1. Open System Settings > Network > Firewall > Options.
  2. Look for Ollama in the application list.
  3. If present, ensure it is set to Allow incoming connections.
  4. If not present, click + and add the Ollama application (usually /Applications/Ollama.app, or the binary at /usr/local/bin/ollama; a Homebrew install on Apple Silicon lands in /opt/homebrew/bin/ollama).

Alternative: Allow via Command Line

# Check firewall status
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --getglobalstate

# Add Ollama to allowed applications
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --add /usr/local/bin/ollama
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --unblockapp /usr/local/bin/ollama

Verify Connectivity

From the Kubernetes host or any LAN machine:

# Test TCP connectivity
nc -zv <ollama-host> 11434

# Test API response
curl -s -o /dev/null -w "%{http_code}" http://<ollama-host>:11434/api/tags
# Should return 200

Kubernetes Integration

Direct IP Access

The simplest approach: configure Kubernetes workloads to call Ollama at the Mac’s LAN IP directly.

In the Compass API deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: compass-api
  namespace: f3iai
spec:
  template:
    spec:
      containers:
      - name: compass-api
        env:
        - name: LLM_ENDPOINT
          value: "http://<ollama-host>:11434"
        - name: LLM_MODEL
          value: "qwen3.5:35b-a3b-q4_K_M"

This works when:

  • The Kubernetes nodes can reach <ollama-host> on the LAN.
  • The Mac’s IP is static or reserved via DHCP.
  • No network policies block egress to the LAN from the f3iai namespace.

ExternalName Service

Create a Kubernetes Service that acts as a DNS alias for the Mac’s IP. This lets pods reference Ollama as a service name rather than a hard-coded IP.

apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: f3iai
spec:
  type: ExternalName
  externalName: "<ollama-host>"
  ports:
  - port: 11434
    targetPort: 11434
    protocol: TCP

Note: externalName is meant to hold a DNS hostname; placing an IP address there is not officially supported and behaves inconsistently across Kubernetes versions and DNS implementations. If it does not resolve, use a selector-less Service with an Endpoints resource instead:

apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: f3iai
spec:
  ports:
  - port: 11434
    targetPort: 11434
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  name: ollama
  namespace: f3iai
subsets:
- addresses:
  - ip: "<ollama-host>"
  ports:
  - port: 11434
    protocol: TCP

Pods can then reference http://ollama.f3iai.svc.cluster.local:11434 or simply http://ollama:11434 from within the f3iai namespace.
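On clusters that have deprecated direct Endpoints management in favor of EndpointSlices, the same static mapping can be written as an EndpointSlice paired with the selector-less Service. This is a sketch assuming discovery.k8s.io/v1 is available; the kubernetes.io/service-name label is what links the slice to the ollama Service:

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: ollama-1
  namespace: f3iai
  labels:
    kubernetes.io/service-name: ollama   # must match the Service name
addressType: IPv4
endpoints:
- addresses:
  - "<ollama-host>"
ports:
- port: 11434
  protocol: TCP
```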

Network Policies

If the f3iai namespace has NetworkPolicy resources that restrict egress, you need to allow traffic to the Mac’s IP:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ollama-egress
  namespace: f3iai
spec:
  podSelector:
    matchLabels:
      app: compass-api
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: <ollama-host>/32
    ports:
    - protocol: TCP
      port: 11434

DNS Considerations

If the Mac’s IP changes, every reference to <ollama-host> breaks. Mitigation options:

  1. DHCP reservation: Configure your router to always assign <ollama-host> to the Mac’s MAC address. This is the recommended approach.

  2. Static IP on the Mac: System Settings > Network > Wi-Fi/Ethernet > Details > TCP/IP > Configure IPv4: Manually. Set the IP, subnet mask, router, and DNS server.

  3. Local DNS entry: If you run a local DNS server (e.g., CoreDNS, Pi-hole, or your router’s DNS), create an A record:

    ollama.local.lan  A  <ollama-host>
    

    Then reference http://ollama.local.lan:11434 in all configurations.
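If the local DNS server is dnsmasq-based (Pi-hole's resolver is dnsmasq), option 3 is a single address line in a drop-in config file; the path shown is illustrative:

```conf
# /etc/dnsmasq.d/10-ollama.conf (illustrative location)
address=/ollama.local.lan/<ollama-host>
```

Restart dnsmasq (or Pi-hole's pihole-FTL service) afterwards so the record takes effect.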

TLS Configuration

Ollama does not support TLS natively. All traffic is plain HTTP. For LAN-only traffic between the Kubernetes cluster and the Mac, this is typically acceptable. If you require TLS:

Caddy Reverse Proxy

Install Caddy on the Mac and configure it as a TLS-terminating reverse proxy:

brew install caddy

Create a Caddyfile:

ollama.local.lan:11435 {
    reverse_proxy localhost:11434
    tls internal
}

This generates a self-signed certificate. For trusted certificates, use a proper domain and ACME provider.

Start Caddy:

caddy run --config /path/to/Caddyfile

Clients then connect to https://ollama.local.lan:11435. Note that Kubernetes pods will need to trust Caddy's local CA certificate, or else be configured to skip certificate verification (e.g., curl -k), which defeats much of the point of TLS.
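One way to get pods to trust the CA without disabling verification: copy Caddy's locally generated root certificate (with tls internal it is kept under Caddy's data directory, on macOS typically ~/Library/Application Support/Caddy/pki/authorities/local/root.crt) into a ConfigMap and mount it into the client pod. The names and env variable below are illustrative; whether SSL_CERT_FILE is honored depends on the client's TLS stack (Go and OpenSSL-based clients respect it, Python requests does not):

```yaml
# First: kubectl create configmap caddy-local-ca -n f3iai --from-file=root.crt
# Fragment of the compass-api pod spec (names illustrative):
spec:
  containers:
  - name: compass-api
    env:
    - name: LLM_ENDPOINT
      value: "https://ollama.local.lan:11435"
    - name: SSL_CERT_FILE          # honored by Go/OpenSSL clients; verify for yours
      value: /etc/ssl/ollama/root.crt
    volumeMounts:
    - name: caddy-local-ca
      mountPath: /etc/ssl/ollama
      readOnly: true
  volumes:
  - name: caddy-local-ca
    configMap:
      name: caddy-local-ca
```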

nginx Reverse Proxy

server {
    listen 11435 ssl;
    server_name ollama.local.lan;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://127.0.0.1:11434;
        proxy_set_header Host $host;
        proxy_read_timeout 120s;  # Tool calling can be slow
        proxy_send_timeout 120s;
    }
}

The proxy_read_timeout of 120 seconds is important: Ollama can take 60-120 seconds to process requests with 150+ tools in the payload.
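The cert.pem and key.pem paths in the server block above are placeholders. For a LAN-only setup without an internal CA, a self-signed pair can be generated with OpenSSL; clients then need to trust cert.pem explicitly (e.g., curl --cacert cert.pem) or skip verification:

```shell
# Generate a self-signed key/cert pair for ollama.local.lan, valid 365 days.
# -addext requires OpenSSL 1.1.1+; adjust the CN/SAN to your chosen hostname.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout key.pem -out cert.pem -days 365 \
  -subj "/CN=ollama.local.lan" \
  -addext "subjectAltName=DNS:ollama.local.lan"
```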

Monitoring Connectivity

From the Mac (server side)

# Check Ollama is running and what models are loaded
ollama ps

# Check what port Ollama is listening on
lsof -i :11434

# Watch Ollama logs for incoming requests (path depends on how Ollama was
# started; the macOS app writes to ~/.ollama/logs/server.log)
tail -f /tmp/ollama.err.log

From a Kubernetes pod (client side)

# Exec into a pod in the f3iai namespace
kubectl exec -it -n f3iai deploy/compass-api -- sh

# Test connectivity
curl -s http://<ollama-host>:11434/api/tags

# Test inference
curl -s http://<ollama-host>:11434/api/generate \
  -d '{"model":"qwen3.5:35b-a3b-q4_K_M","prompt":"hello","stream":false}'

Automated Health Check

A simple script to verify Ollama is accessible and a model is loaded:

#!/bin/bash
OLLAMA_HOST="<ollama-host>:11434"

# Check API is reachable
if ! curl -sf "http://${OLLAMA_HOST}/api/tags" > /dev/null 2>&1; then
    echo "CRITICAL: Ollama API unreachable at ${OLLAMA_HOST}"
    exit 2
fi

# Check if the production model is available
MODEL="qwen3.5:35b-a3b-q4_K_M"
if ! curl -sf "http://${OLLAMA_HOST}/api/tags" | python3 -c "
import sys, json
models = json.load(sys.stdin)['models']
names = [m['name'] for m in models]
sys.exit(0 if '${MODEL}' in names else 1)
" 2>/dev/null; then
    echo "WARNING: Model ${MODEL} not found on ${OLLAMA_HOST}"
    exit 1
fi

echo "OK: Ollama healthy, ${MODEL} available"
exit 0
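To run the check on a schedule from a LAN machine, a cron entry suffices; the script path, interval, and log file are illustrative (on the Mac itself, a launchd job with StartInterval is the native equivalent):

```conf
# Run every 5 minutes, appending results to a log
*/5 * * * * /usr/local/bin/ollama-healthcheck.sh >> /var/log/ollama-health.log 2>&1
```

The exit codes (0 OK, 1 WARNING, 2 CRITICAL) follow the Nagios plugin convention, so the same script also works unchanged under NRPE-style monitoring.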