Netris Architecture — Overlay vs. Underlay

How the OpenStack OVN overlay and the Netris physical underlay separate cleanly, meeting only at the OVN-BGP agent — plus the three integration layers and the FFO data-flow discipline. A technical reference implementation.


Technical Reference Implementation. Target architecture per ADR-007. Reference design, not a deployed system.

The single most important idea in this integration is the clean separation between the software overlay and the physical underlay. OpenStack’s OVN/OVS handles virtual, per-tenant networking in software. Netris handles the physical fabric. They are independent layers with different jobs and different knowledge, and they meet at exactly one point.

Overlay vs. underlay — the separation

OpenStack OVN overlay vs. Netris physical underlay — the two layers meet only at the OVN-BGP agent

  • ① Software overlay — OVN/OVS, managed by Neutron. Tenant VMs, logical switches and routers, security groups, floating IPs. Runs in software on the compute nodes (OVS + ovn-controller). It understands logical/tenant topology — not physical cabling, switch ports, or BGP ASNs.
  • ② Physical underlay — Netris fabric automation. Spine/leaf switches, BGP/EVPN underlay, SoftGate border routing, BlueField DPU hardware isolation, VPC lifecycle. It understands physical topology, ports, and ASNs — not tenant logical networks or security groups.
  • The seam — the OVN-BGP agent. The single touchpoint: a daemon on each compute node that advertises OpenStack tenant prefixes and floating IPs into the Netris-managed BGP underlay.

The rule, stated plainly: OVN never programs a switch; Netris never runs the VM overlay; the OVN-BGP agent is the only handoff. Because the two layers are decoupled this way, the physical fabric can be automated, simulated, and reasoned about without disturbing the running software overlay.
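To make the handoff concrete, the sketch below models the only thing that crosses the seam: a list of prefixes derived from overlay state. It is an illustration under stated assumptions, not the OVN-BGP agent's own code. The function name `routes_to_advertise` is hypothetical, and announcing floating IPs as host routes (/32 or /128) is an assumption made for the example.

```python
import ipaddress


def routes_to_advertise(floating_ips, tenant_prefixes):
    """Return the prefixes a compute node would hand to the underlay.

    Hypothetical helper: floating IPs become host routes, tenant
    prefixes are validated and passed through unchanged.
    """
    routes = set()
    for ip in floating_ips:
        addr = ipaddress.ip_address(ip)
        # A host route uses the full prefix length: /32 (IPv4) or /128 (IPv6).
        routes.add(ipaddress.ip_network(f"{addr}/{addr.max_prefixlen}"))
    for prefix in tenant_prefixes:
        # strict=True rejects prefixes that have host bits set.
        routes.add(ipaddress.ip_network(prefix, strict=True))
    # Sort by IP version first so IPv4 and IPv6 networks are never compared directly.
    ordered = sorted(routes, key=lambda n: (n.version, n.network_address, n.prefixlen))
    return [str(n) for n in ordered]


if __name__ == "__main__":
    print(routes_to_advertise(["203.0.113.10"], ["10.0.1.0/24"]))
```

The result carries prefixes only. Nothing in it names a switch port, a cable, or an ASN, which matches the overlay's limited view of the physical fabric.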

The three integration layers

%%{init: {'theme':'base','themeVariables':{'primaryColor':'#0c332d','primaryTextColor':'#e6edf3','primaryBorderColor':'#2dd4bf','lineColor':'#8b949e','secondaryColor':'#11203a','tertiaryColor':'#161b22','fontSize':'14px'}}}%%
graph TD
    subgraph OVERLAY["Overlay (software, Neutron-managed)"]
        VM["Tenant VMs"]
        OVN["OVN / OVS logical networks"]
    end
    AGENT["OVN-BGP agent<br/>(the seam)"]
    subgraph UNDERLAY["Underlay (Netris-managed physical fabric)"]
        BGP["BGP / EVPN underlay"]
        FAB["Spine / leaf switches"]
        DPU["BlueField DPU isolation"]
    end
    VM --> OVN --> AGENT
    AGENT -->|advertises tenant prefixes / floating IPs| BGP
    BGP --> FAB
    FAB --- DPU

  1. Underlay fabric (Netris-owned). Netris manages the physical spine/leaf fabric independently of OpenStack, providing routed L3 connectivity between all nodes and BGP sessions on each node uplink.
  2. OpenStack overlay (Neutron/OVN). VM-to-VM traffic continues to use OVN/OVS. The OVN-BGP agent advertises tenant prefixes and floating IPs into the underlay — the overlay never needs to understand physical topology.
  3. GPU bare-metal (Netris-direct). GPU nodes that run no hypervisor bypass the OVN overlay entirely, attaching to Netris-managed fabric VLANs via SR-IOV or passthrough NICs, so the data path carries no overlay encapsulation or hypervisor overhead. Node provisioning calls the Netris API to configure the switch port, VPC membership, and DPU policy.
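
The bare-metal path in layer 3 can be sketched as a provisioning step that collects the three things the fabric must be told: switch port, VPC membership, and DPU policy. Everything below is hypothetical, including the class name, the field names, and the payload shape. It does not reproduce the Netris API schema and makes no network call; it only shows how a provisioning hook might assemble and validate its intent before handing it to a real client.

```python
from dataclasses import dataclass

# Attachment modes named in the text; the string spellings are assumptions.
ALLOWED_MODES = {"sr-iov", "passthrough"}


@dataclass(frozen=True)
class GpuNodeAttachment:
    """Hypothetical description of one GPU node's fabric attachment."""
    node: str
    switch_port: str   # illustrative format, e.g. "leaf1:swp12"
    vpc: str
    dpu_policy: str
    attach_mode: str   # "sr-iov" or "passthrough"


def build_attachment_request(att):
    """Return the three fabric changes as plain dicts (illustrative shape)."""
    if att.attach_mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported attach mode: {att.attach_mode!r}")
    return {
        "switch_port": {"node": att.node, "port": att.switch_port, "mode": att.attach_mode},
        "vpc_membership": {"node": att.node, "vpc": att.vpc},
        "dpu_policy": {"node": att.node, "policy": att.dpu_policy},
    }


if __name__ == "__main__":
    att = GpuNodeAttachment("gpu-01", "leaf1:swp12", "vpc-train", "isolate-tenant", "sr-iov")
    print(build_attachment_request(att))
```

Note that nothing in the request refers to Neutron or OVN objects, which is the point of the Netris-direct path.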

Data-flow discipline

The platform’s digital twin (FFO) is not a time-series database, an event stream, or a cache of real-time telemetry. What flows where is deliberate:

| Data | Cadence | Destination | Rationale |
|---|---|---|---|
| Fabric topology, VPCs, DPU policy, BGP peer configuration, node membership | Periodic poll (Scout) | FFO | Slowly-changing structural state: context an agent needs before acting. |
| BGP session live state, switch-port live state | On-demand at investigation time | Agent context only (live MCP query) | Must be fresh; would be wrong if read from a stale store. |
| Link utilisation, packet loss, BGP prefix trends, port-state events | Continuous | Prometheus / Alertmanager | High-frequency metrics and events; these drive alerts, never the twin. |
| Remediation outcomes, post-mortems | After each action | FFO (write-back) | Institutional memory: what was done and why. |

The distinction matters most for BGP: the configuration (who peers with whom, expected ASNs) is structural and lives in FFO; the live session state (established / active / idle, current prefix counts) is queried live at investigation time, because stale state would produce incorrect remediation decisions.
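
The routing rules and the BGP split can be expressed as a small lookup plus a comparison. This is a minimal sketch: the data-kind keys, the `Destination` enum, and both function names are hypothetical, and the live session states are passed in as an argument rather than fetched, since the query mechanism is outside the scope of this page.

```python
from enum import Enum


class Destination(Enum):
    FFO = "FFO"
    LIVE_QUERY = "agent context only (live query)"
    METRICS = "Prometheus / Alertmanager"
    FFO_WRITE_BACK = "FFO (write-back)"


# Hypothetical keys, one per data item in the data-flow rules.
ROUTING = {
    "fabric_topology": Destination.FFO,
    "vpcs": Destination.FFO,
    "dpu_policy": Destination.FFO,
    "bgp_peer_configuration": Destination.FFO,
    "node_membership": Destination.FFO,
    "bgp_session_state": Destination.LIVE_QUERY,
    "switch_port_state": Destination.LIVE_QUERY,
    "link_utilisation": Destination.METRICS,
    "packet_loss": Destination.METRICS,
    "bgp_prefix_trends": Destination.METRICS,
    "port_state_events": Destination.METRICS,
    "remediation_outcome": Destination.FFO_WRITE_BACK,
    "post_mortem": Destination.FFO_WRITE_BACK,
}


def destination_for(kind):
    """Look up where a data kind belongs; unclassified kinds are an error."""
    try:
        return ROUTING[kind]
    except KeyError:
        raise ValueError(f"unclassified data kind: {kind!r}") from None


def unhealthy_peers(expected_peers, live_states):
    """Compare structural config (peer -> ASN) with a fresh state snapshot.

    Returns the expected peers whose live state is not 'established',
    including peers missing from the live snapshot altogether.
    """
    return sorted(
        peer for peer in expected_peers
        if live_states.get(peer, "missing").lower() != "established"
    )


if __name__ == "__main__":
    expected = {"10.0.0.1": 65001, "10.0.0.2": 65002}   # structural: from the twin
    live = {"10.0.0.1": "Established", "10.0.0.2": "Idle"}  # fresh: queried now
    print(destination_for("bgp_session_state").value)
    print(unhealthy_peers(expected, live))
```

Keeping `expected_peers` and `live_states` as separate arguments mirrors the discipline: the first may come from a periodically polled store, while the second should only ever come from a query made at investigation time.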

Why the simulation cannot disturb production networking

Because Netris sits at the physical underlay and OVN runs in software on the compute nodes, a simulated Netris fabric models the physical layer only; it neither models nor touches OVN. The integration’s data-plane seam (the OVN-BGP agent advertising real tenant prefixes and floating IPs) is deliberately left out of the reference/simulation scope: it is described, not exercised. This is what makes it safe to build and validate the fabric-automation artifacts ahead of physical hardware, with no risk to a running OpenStack overlay.

Continue to Agents & FFO Integration for the fabric ontology, ingestion pattern, and the autonomous operations loop.