Hey,
I managed to successfully advertise my OVN networks over BGP but traffic is not routed from the host onto the OVN network.
<removed old information, new information in the first reply>
Some information about my setup for nmezhenskyi
I run a 3-node LXD 6.4 cluster whose nodes share a physical layer 2 network.
This L2 network is 192.168.3.0/24, has a BGP-capable router on 192.168.3.1, and the three nodes run 192.168.3.61, 192.168.3.62 and 192.168.3.63.
The goal is to create OVN networks with a /24 private IPv4 (somewhere in 172.16.0.0/12) and a /64 public IPv6. The BGP-capable router will handle NAT and simply route the public IPv6 /64. LXD will not do any NAT at all.
We have Canonical k8s with Cilium/BGP running the exact same setup: fully routed dual stack, no NAT.
I have tried both physical and managed bridge type uplinks for the OVN networks. My preference is physical as it allows me to also use the uplink for my host. The use of physical uplinks is also assumed by the OVN setup guide.
The nodes have 2 physical links: one to be used for the guest uplink and the other for management, LXD and Ceph cluster traffic. This is so that traffic spikes on one do not impact the other. I wouldn’t want my guests without network if Ceph is doing some rebalancing, or my storage to be impacted if a client fully utilizes the uplink.
As this is still a test setup I have not given the nodes IPv6 yet, but of course each node gets a static IPv6 when we move to production. I couldn’t get things to work, so I simplified and temporarily reverted to IPv4 only.
network:
  version: 2
  ethernets:
    enp5s0:
      dhcp4: false
    enp6s0:
      dhcp4: false
  bridges:
    ## Guest uplink
    br0:
      dhcp4: false
      interfaces: [ enp5s0 ]
    ## Management, LXD and Ceph
    br1:
      dhcp4: false
      interfaces: [ enp6s0 ]
      addresses:
        - 192.168.3.61/24
      routes:
        - to: default
          via: 192.168.3.1
Hi @vosdev I’ll let @nikita-mezhenskyi reply re BGP setup, however one thing I noted in your setup notes is that you don’t need to set up a br0 bridge to use as the OVN uplink network; instead you can specify enp5s0 as the OVN uplink interface directly.
Then LXD will create an OVS bridge for you and connect enp5s0 to it and then connect the bridge to OVN.
Otherwise currently in your setup the connectivity will be established as follows:
enp5s0 <-> br0 bridge <-veth pair-> OVS bridge <-> OVN
By specifying enp5s0 as the parent in your physical network definition in LXD you can remove the br0 bridge and the veth pair so it becomes:
enp5s0 <-> OVS bridge <-> OVN
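For reference, a minimal sketch of that approach (network name, ranges and gateway here are just example values for your 192.168.3.0/24 uplink; adjust to your environment):

lxc network create uplink0 --type=physical parent=enp5s0 --target=node1
# ...repeat for the other cluster members, then finalise cluster-wide:
lxc network create uplink0 --type=physical \
    ipv4.ovn.ranges=192.168.3.5-192.168.3.25 \
    ipv4.gateway=192.168.3.1/24 \
    ipv4.routes=172.16.0.0/16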
Thanks! I will keep that in mind. I started out with a single-interface setup, so a bridge was the only option, because physical requires the interface to be fully unmanaged from the OS. If letting LXD fully handle the physical interface is better for latency/less complexity then I’m all for it!
Therefore, you must specify either an unmanaged bridge interface or an unused physical interface as the parent for the physical network that is used for OVN uplink. The instructions assume that you are using a manually created unmanaged bridge. See How to configure network bridges for instructions on how to set up this bridge.
And I have also tried it with a managed bridge setup, like so:
root@node1:~# lxc network create uplink0 --type=bridge --target node1
Network uplink0 pending on member node1
root@node1:~# lxc network create uplink0 --type=bridge --target node2
Network uplink0 pending on member node2
root@node1:~# lxc network create uplink0 --type=bridge --target node3
Network uplink0 pending on member node3
root@node1:~# lxc network set uplink0 bridge.external_interfaces enp6s0 --target node1
root@node1:~# lxc network set uplink0 bridge.external_interfaces enp6s0 --target node2
root@node1:~# lxc network set uplink0 bridge.external_interfaces enp6s0 --target node3
root@node1:~# lxc network create uplink0 --type=bridge \
ipv4.address=192.168.3.1/24 \
ipv4.nat=false \
ipv6.address=none
Network uplink0 created
It seems there are 3 ways to Rome.
Also, the following isn’t mentioned in the OVN setup guide:
physical requires the interface to be fully unmanaged from the OS.
Yes the benefit of using a (managed or unmanaged) bridge is that you can have the host present with an IP on the uplink network - which can be useful if you don’t have a spare physical port or VLAN/bond interface.
@vosdev I see that you have set up a BGP listener on the LXD host (192.168.3.197). Could you try doing that on a cluster node instead?
For example, on the 192.168.3.61 you could try this config:
lxc config set core.bgp_address=192.168.3.61
lxc config set core.bgp_asn=65197
lxc config set core.bgp_routerid=192.168.3.61
Then, update your BGP router’s (192.168.3.1) config to connect to the BGP listener on 192.168.3.61.
Please let me know whether this setup routes traffic to your OVN network correctly.
@vosdev please show output of lxc network show ovn25 as in your original post
Your OVN network is not showing any volatile address keys, which would represent the IPs of the OVN router on your uplink network.
This looks problematic to me.
Hey, sorry for the confusion. The information in the original post was from a different standalone test node (192.168.3.197).
All three nodes in my cluster have BGP set up and successfully announce every LXD network configured on the cluster.
root@node1:~# lxc config show --target node1 | grep bgp
core.bgp_address: 192.168.3.61
core.bgp_asn: "65061"
core.bgp_routerid: 192.168.3.61
root@node1:~# lxc config show --target node2 | grep bgp
core.bgp_address: 192.168.3.62
core.bgp_asn: "65061"
core.bgp_routerid: 192.168.3.62
root@node1:~# lxc config show --target node3 | grep bgp
core.bgp_address: 192.168.3.63
core.bgp_asn: "65061"
core.bgp_routerid: 192.168.3.63
root@node1:~# lxc query /internal/testing/bgp
{
"peers": [
{
"address": "192.168.3.1",
"asn": 65000,
"count": 1,
"holdtime": 0,
"password": ""
}
],
"prefixes": [
{
"nexthop": "192.168.3.5",
"owner": "network_17",
"prefix": "172.16.5.0/24"
},
{
"nexthop": "192.168.3.6",
"owner": "network_19",
"prefix": "172.16.6.0/24"
},
{
"nexthop": "0.0.0.0",
"owner": "network_12",
"prefix": "192.168.3.0/24"
}
],
"server": {
"address": "192.168.3.61",
"asn": 65061,
"router_id": "192.168.3.61",
"running": true
}
}
root@node1:~# lxc network ls
+----------+----------+---------+-----------------+---------------------------+-------------+---------+---------+
| NAME | TYPE | MANAGED | IPV4 | IPV6 | DESCRIPTION | USED BY | STATE |
+----------+----------+---------+-----------------+---------------------------+-------------+---------+---------+
| br-int | bridge | NO | | | | 0 | |
+----------+----------+---------+-----------------+---------------------------+-------------+---------+---------+
| enp5s0 | physical | NO | | | | 0 | |
+----------+----------+---------+-----------------+---------------------------+-------------+---------+---------+
| enp6s0 | physical | NO | | | | 0 | |
+----------+----------+---------+-----------------+---------------------------+-------------+---------+---------+
| lxdovn12 | bridge | NO | | | | 0 | |
+----------+----------+---------+-----------------+---------------------------+-------------+---------+---------+
| ovn5 | ovn | YES | 172.16.5.1/24 | | | 4 | CREATED |
+----------+----------+---------+-----------------+---------------------------+-------------+---------+---------+
| ovn6 | ovn | YES | 172.16.6.1/24 | fd42:11ef:adee:d29f::1/64 | | 0 | CREATED |
+----------+----------+---------+-----------------+---------------------------+-------------+---------+---------+
| uplink0 | bridge | YES | 192.168.3.60/24 | none | | 2 | CREATED |
+----------+----------+---------+-----------------+---------------------------+-------------+---------+---------+
root@node1:~# lxc network show ovn5
name: ovn5
description: ""
type: ovn
managed: true
status: Created
config:
bridge.mtu: "1442"
ipv4.address: 172.16.5.1/24
ipv6.nat: "true"
network: uplink0
volatile.network.ipv4.address: 192.168.3.5
used_by:
- /1.0/instances/c1
- /1.0/instances/c2
- /1.0/instances/c3
- /1.0/profiles/ovn5
locations:
- node2
- node3
- node1
project: default
root@node1:~# lxc network show uplink0
name: uplink0
description: ""
type: bridge
managed: true
status: Created
config:
bgp.peers.opnsense.address: 192.168.3.1
bgp.peers.opnsense.asn: "65000"
ipv4.address: 192.168.3.60/24
ipv4.dhcp.ranges: 192.168.3.26-192.168.3.49
ipv4.nat: "false"
ipv4.ovn.ranges: 192.168.3.5-192.168.3.25
ipv4.routes: 172.16.0.0/16
ipv6.address: none
used_by:
- /1.0/networks/ovn5
- /1.0/networks/ovn6
locations:
- node1
- node2
- node3
project: default

(btw, why is it also announcing the 192.168.3.0/24 network?)
Please let me know whether this setup routes traffic to your OVN network correctly.
This volatile IPv4 address 192.168.3.5 is not reachable anywhere on my network, not even from the LXD hosts themselves.
Your OVN network is not showing any volatile address keys, which would represent the IPs of the OVN router on your uplink network.
This looks problematic to me.
The OVN networks on my 3-node cluster do have this volatile address, but it’s not reachable anywhere.
root@node1:~# ping 192.168.3.5
PING 192.168.3.5 (192.168.3.5) 56(84) bytes of data.
From 192.168.3.61 icmp_seq=1 Destination Host Unreachable
From 192.168.3.61 icmp_seq=2 Destination Host Unreachable
The uplink0 interface only has the 192.168.3.60/24 address configured on the LXD uplink0 network, not the 192.168.3.5 or 192.168.3.6 for ovn5 and ovn6:
root@node1:~# ip a show uplink0
8: uplink0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:16:3e:11:cb:75 brd ff:ff:ff:ff:ff:ff
inet 192.168.3.60/24 scope global uplink0
valid_lft forever preferred_lft forever
I hope to have cleared up the confusion with my .197 node from the first post. If there is any more information you require, I’ll be happy to provide it!
Because you’re using a managed bridge network for the uplink with bgp.peers settings, see:
https://documentation.ubuntu.com/lxd/latest/howto/network_bgp/#configure-next-hop-bridge-only
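If you did want to keep the managed bridge uplink, a minimal sketch of the next-hop override described on that page (key name per the linked how-to; the address would be the host’s own IP on the uplink network):

lxc network set uplink0 bgp.ipv4.nexthop=192.168.3.61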
Oh! Another reason for going back to a bare physical link.
This is because you’re using a managed bridge network as the uplink.
This is arguably a bug in LXD, because it’s not an intended configuration to combine OVN BGP announcements and managed bridge networks.
Let’s consider the configuration you have:
- enp6s0 using br1 bridge with subnet 192.168.3.0/24 (I think), with each member having a unique IP on that network (e.g. 192.168.3.61).
- uplink0 that also has the range 192.168.3.0/24, with every cluster member having the same address 192.168.3.60.
- This will likely cause conflicting automatic routes on your LXD hosts, as you effectively have two routes to 192.168.3.0/24, one via the external uplink interface br0 and one into the managed bridge uplink0.
- uplink0 announces routes to its own subnet using its peer source address as the next hop (because each cluster member should have its own address on the remote router’s network).
- Your OVN networks are connected to the uplink0 private bridge, which then causes route announcements to the OVN network’s volatile address on its uplink (uplink0 rather than br1).
- That volatile address is therefore not reachable from the br1 network.

Are enp5s0 and enp6s0 connected to the same physical network?
My recommendation would be to simplify.
If enp5s0 and enp6s0 are both connected to the same physical network segment, then you have a couple of choices:
- Keep one interface unnumbered and create an uplink physical network in LXD that uses that unnumbered interface. LXD will set up an OVS bridge for you and connect the unnumbered interface and OVN together.
- Bond enp5s0 and enp6s0 for redundancy and then create a netplan bridge on top of that (ideally an OVS bridge to avoid the additional veth pair interconnect) and use it both for management/cluster and as an OVN uplink. Then create an uplink physical network in LXD that uses that netplan bridge. LXD will detect the bridge and connect it to OVN, either by directly connecting to the OVS bridge or by setting up an OVS bridge and using a veth pair to connect to the native Linux bridge.

I think you nailed it! All these 7 points are spot on.
I will re-create with a physical uplink, using a netplan OVS bridge, and come back to you.
- This will likely cause conflicting automatic routes on your LXD hosts, as you effectively have two routes to 192.168.3.0/24, one via the external uplink interface br0 and one into the managed bridge uplink0.
Yes
Are enp5s0 and enp6s0 connected to the same physical network?
Yes, but one is 1Gbit and the other is 10Gbit. The 10Gbit link is for Ceph and the 1Gbit link would be for OVN and node management. Ideally I would get LXD to also use the 10Gbit link for instance migration, but that’s something to figure out later. (Some instances will run on local ZFS.)
For now the goal is to get things to work. Thank you so much for the help so far, it’s been very detailed!
It works: both IPv4 and IPv6 are fully routed. Using the right bridge, it was actually pretty quick to get everything working. I spent more time fixing IPv6 BGP and firewall issues on my router than setting up both OVN and LXD.
The reason for using a managed bridge was a conversation I had with @edlerd on GitHub, because it was the “preferred” way, but I have now reverted that decision and gone with a bridge created in netplan.
This is arguably a bug in LXD, because it’s not an intended configuration to combine OVN BGP announcements and managed bridge networks.
Maybe this is something that can be blocked.
Thank you so much @tomp for your assistance! It seems the guide was enough after all.
I think a good addition to the current guide would be some guidance on a NAT-free setup, pointing readers towards the BGP guide, plus some notes to get everything up & running with microovn. Your notes regarding the ups and downs of bridges and Open vSwitch bridges in netplan are also valuable information that could be added.
I might move from microovn to a local OVN install in the future so that I can work with an Open vSwitch based bridge directly in netplan.
Below is a write-up with my complete setup, my config and the commands used to get this to work.
I have three nodes sharing an L2 network (192.168.3.0/24): node1, node2 and node3.
To prevent routing issues, the 10Gbit link has a subnet that is not routed on this L2 network.
network:
  version: 2
  ethernets:
    enp5s0: {}
    # 10Gbit; Ceph + LXD Cluster
    enp6s0:
      dhcp4: false
      dhcp6: false
      accept-ra: false
      addresses:
        - 10.0.0.61/24 # Local use only
  bridges:
    # 1Gbit; Management, OVN, LXD API
    br-enp5s0:
      #openvswitch:
      #  protocols: [OpenFlow13, OpenFlow14, OpenFlow15]
      #  external-ids:
      #    iface-id: node1
      interfaces:
        - enp5s0
      addresses:
        - 192.168.3.61/24
      routes:
        - to: default
          via: 192.168.3.1
Because I am currently running OVN via the microovn snap, I am not able to turn this bridge into an Open vSwitch bridge, because Open vSwitch cannot run both on my host and inside the snap; therefore I am using a standard Linux bridge.
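For anyone running Open vSwitch on the host itself rather than inside the microovn snap, the commented-out section above could be enabled along these lines (a sketch only, not what I ended up running):

network:
  version: 2
  ethernets:
    enp5s0: {}
  bridges:
    br-enp5s0:
      # OVS bridge instead of a native Linux bridge, so LXD/OVN can attach
      # to it directly without the extra veth pair.
      openvswitch: {}
      interfaces: [ enp5s0 ]
      addresses: [ 192.168.3.61/24 ]
      routes:
        - to: default
          via: 192.168.3.1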
I have reserved 172.16.0.0/16 and 2001:db8:1234:abc0::/60 for this LXD cluster’s OVN networks.
Set up OVN using microovn:
snap install microovn
microovn cluster bootstrap
microovn cluster add node2
microovn cluster add node3
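Note: microovn cluster add prints a join token for each new member; if I recall correctly, that token then has to be used on the node that is joining, roughly like this:

# on node1: prints a token for node2
microovn cluster add node2
# on node2: join using that token
microovn cluster join <token printed on node1>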
Configure LXD for use with this OVN:
lxc config set network.ovn.ca_cert="$(cat /var/snap/microovn/common/data/pki/cacert.pem)" \
network.ovn.client_cert="$(cat /var/snap/microovn/common/data/pki/client-cert.pem)" \
network.ovn.client_key="$(cat /var/snap/microovn/common/data/pki/client-privkey.pem)"
lxc config set network.ovn.northbound_connection=ssl:192.168.3.61:6641,ssl:192.168.3.62:6641,ssl:192.168.3.63:6641
@nikita-mezhenskyi the PKI steps are currently not in the docs and it’s hard to get the formatting right, so I think these commands would be a good addition to the documentation. It might also be good to mention that network.ovn.northbound_connection accepts comma-separated values.
Next up, creating the uplink on each node:
lxc network create uplink0 --type=physical parent=br-enp5s0 --target=node1
lxc network create uplink0 --type=physical parent=br-enp5s0 --target=node2
lxc network create uplink0 --type=physical parent=br-enp5s0 --target=node3
lxc network create uplink0 --type=physical \
ipv4.ovn.ranges=192.168.3.5-192.168.3.25 \
ipv4.gateway=192.168.3.1/24 \
ipv4.routes=172.16.0.0/16 \
ipv6.ovn.ranges=2001:db8:1234:3::5-2001:db8:1234:3::25 \
ipv6.gateway=2001:db8:1234:3::1/64 \
ipv6.routes=2001:db8:1234:abc0::/60 \
dns.nameservers=1.1.1.1,2606:4700:4700::1111 \
bgp.peers.opnsense.address=192.168.3.1 \
bgp.peers.opnsense.asn=65000
And now you’re able to create the virtual cluster networks:
lxc network create ovn1 --type=ovn \
ipv4.address=172.16.1.1/24 \
ipv4.nat="false" \
ipv6.address=2001:db8:1234:abc1::1/64 \
ipv6.nat="false" \
network=uplink0
lxc network create ovn2 --type=ovn \
ipv4.address=172.16.2.1/24 \
ipv4.nat="false" \
ipv6.address=2001:db8:1234:abc2::1/64 \
ipv6.nat="false" \
network=uplink0
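Instances can then be attached to these networks, for example (instance name and image alias are just placeholders):

lxc launch ubuntu:24.04 c1 --network ovn1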
Now we need to configure LXD’s BGP:
lxc config set core.bgp_asn=65060
lxc config set core.bgp_address=192.168.3.61 --target node1
lxc config set core.bgp_address=192.168.3.62 --target node2
lxc config set core.bgp_address=192.168.3.63 --target node3
lxc config set core.bgp_routerid=192.168.3.61 --target node1
lxc config set core.bgp_routerid=192.168.3.62 --target node2
lxc config set core.bgp_routerid=192.168.3.63 --target node3
Here’s a command that gives you some debug data for the BGP setup if you run into any issues (this works with --target):
lxc query /internal/testing/bgp
And finally, some changes were required on my router (OPNSense):
- Add 172.16.0.0/16 to NAT Outbound Traffic to enable NAT. By default OPNSense does not NAT your traffic if it is not locally configured but instead dynamically learned.
- Add 172.16.0.0/16 and 2001:db8:1234:abc0::/60 to your VLAN interface’s firewall rules for inbound traffic.

root@c1:~# ip a show eth0
16: eth0@if17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue state UP group default qlen 1000
link/ether 00:16:3e:3e:a2:2e brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 172.16.1.2/24 metric 100 brd 172.16.1.255 scope global dynamic eth0
valid_lft 3142sec preferred_lft 3142sec
inet6 2001:db8:1234:abc1:216:3eff:fe3e:a22e/64 scope global mngtmpaddr noprefixroute
valid_lft forever preferred_lft forever
inet6 fe80::216:3eff:fe3e:a22e/64 scope link
valid_lft forever preferred_lft forever
root@c1:~# ping google.com -c 1
PING google.com (2a00:1450:400e:801::200e) 56 data bytes
64 bytes from ams17s10-in-x0e.1e100.net (2a00:1450:400e:801::200e): icmp_seq=1 ttl=118 time=7.30 ms
--- google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.299/7.299/7.299/0.000 ms
root@c1:~# ping google.com -c 1 -4
PING google.com (142.251.36.46) 56(84) bytes of data.
64 bytes from ams17s12-in-f14.1e100.net (142.251.36.46): icmp_seq=1 ttl=118 time=11.3 ms
--- google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 11.302/11.302/11.302/0.000 ms
Yes I think this is a known issue that @fnordahl is aware of.
Excellent, glad you got it sorted!