Help with routing rules in a cloud-server-based LXD cluster setup

My setup: LXD cluster running on 3 cloud servers on Hetzner, with a private network between them.

Defining a bridge is not possible, so automatic assignment of private IPs in the 10.0.0.0 range is not an option, and neither is OVN.

As a workaround, it is possible to use “Alias IPs” (up to 5 per server) and assign them to the containers. This works with the solution described here: a routed network whose parent is the interface attached to the private network.

My goal is to have containers on a shared network and with connectivity to the Internet, so I attached another network interface to the profile applied to the containers:

# lxc profile show myprofile
devices:
  eth0:
    nictype: bridged
    parent: lxdbr0
    type: nic
  eth1:
    nictype: routed
    parent: enp7s0
    type: nic
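
For reference, a profile like this can be assembled with plain LXD commands along these lines (a sketch assuming the profile already exists; the device names and parents are the ones shown above):

# lxc profile device add myprofile eth0 nic nictype=bridged parent=lxdbr0
# lxc profile device add myprofile eth1 nic nictype=routed parent=enp7s0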

# lxc init images:ubuntu/22.04/default c1 --profile myprofile --target node01
# lxc config device override c1 eth1 ipv4.address=10.0.0.11
# lxc start c1
# lxc init images:ubuntu/22.04/default c2 --profile myprofile --target node02
# lxc config device override c2 eth1 ipv4.address=10.0.0.21
# lxc start c2

At this point I can ping the containers on their private addresses but I cannot access the internet from them.

# lxc exec c1 ip r
default via 169.254.0.1 dev eth1
default via 10.46.229.1 dev eth0 proto dhcp src 10.46.229.32 metric 100
10.46.229.0/24 dev eth0 proto kernel scope link src 10.46.229.32 metric 100
10.46.229.1 dev eth0 proto dhcp scope link src 10.46.229.32 metric 100
169.254.0.1 dev eth1 scope link

If I add an entry for eth1 in the Netplan file and run netplan apply, Internet access works as well; however, after every container restart I have to run netplan apply manually again.
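
For illustration, a minimal eth1 stanza of that kind (reusing the address from the device override above) could look like:

network:
  version: 2
  ethernets:
    eth1:
      dhcp4: false
      addresses:
        - 10.0.0.11/32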

This is the “correct” routing table:

# lxc exec c1 ip r
default via 10.46.229.1 dev eth0 proto dhcp src 10.46.229.32 metric 100
10.46.229.0/24 dev eth0 proto kernel scope link src 10.46.229.32 metric 100
10.46.229.1 dev eth0 proto dhcp scope link src 10.46.229.32 metric 100

So the question is: how can I obtain this routing table without manually running netplan apply? Or is there a way to avoid using two network interfaces?

Does anyone have some ideas on how to solve this scenario? I can provide more information if needed. Thanks in advance!

I suspect you may well be able to use OVN as an overlay between the 3 cluster servers, using their own IPs on the internal network for the Geneve overlay tunnels, as it’s just unicast traffic AFAIK.
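
For illustration only, pointing each member’s Geneve tunnel endpoint at its internal-network IP could look roughly like this (a sketch assuming Open vSwitch/OVN is installed; 10.0.0.30 stands for the node’s address on the internal 10.0.0.0/16 network):

# ovs-vsctl set open_vswitch . external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=10.0.0.30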

But going back to your current setup of using a routed NIC, is there a specific reason why your containers are connected both to enp7s0 via the routed NIC and to lxdbr0 via a bridged NIC?

Have you considered not connecting the containers to lxdbr0, and instead having them connect only to enp7s0? You can then add an SNAT/MASQUERADE rule to your host firewalls manually in order to NAT egress traffic out of the external interface using the host’s IP.
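
A minimal nftables sketch of such a rule (assuming the containers sit in 10.0.0.0/16 and the host’s external interface is eth0):

table ip nat {
    chain postrouting {
        type nat hook postrouting priority srcnat; policy accept;
        # NAT container traffic leaving via the external interface
        oifname "eth0" ip saddr 10.0.0.0/16 masquerade
    }
}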

You could still configure your containers to use the DNS services provided by lxdbr0 by setting the lxdbr0 network address as their resolver, or you could use an ISP-provided DNS server and remove lxdbr0 entirely.

Thanks @tomp!

I suspect you may well be able to use OVN as an overlay between the 3 cluster servers, using their own IPs on the internal network for the Geneve overlay tunnels, as it’s just unicast traffic AFAIK.

This would be the preferred option. Do you happen to have more details on how this can be achieved? I understand that I wouldn’t need a bridge interface?

But going back to your current setup of using a routed NIC, is there a specific reason why your containers are connected both to enp7s0 via the routed NIC and to lxdbr0 via a bridged NIC?

The reason is that eth0 (based on enp7s0) gives connectivity to the Internet and eth1 (based on lxdbr0) gives internal, cross-node connectivity. But you suggest that an nft (or iptables) rule can be created on the host to provide Internet connectivity via enp7s0? Can you provide some guidance on doing that? Eliminating eth1 would be a good achievement.

Are you sure that’s the right way around? lxdbr0 won’t be providing cross-node comms.

Please can you show ip a and ip r on the host?

Are you sure that’s the right way around? lxdbr0 won’t be providing cross-node comms.

Sorry, I switched eth0 with eth1. In any case, I followed your recommendations and managed to delete lxdbr0. I have created an nftables configuration file, made it persistent across reboots, and replicated it on all the LXD cluster nodes.
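
(For reference, one way of making such a rule set persistent on Ubuntu hosts is to install it as /etc/nftables.conf and enable the stock service; the file name my-nat.nft below is only illustrative:)

# cp my-nat.nft /etc/nftables.conf     # rule set containing the MASQUERADE rule
# systemctl enable --now nftables      # /etc/nftables.conf is loaded at every boot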

The netplan file of the containers looks like:

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses:
        - 10.0.0.11/32
      routes:
        - to: default
          via: 169.254.0.1
          metric: 100
          on-link: true
      nameservers:
        addresses:
          # Hetzner recursive DNS
          - 185.12.64.1

So, it seems that both the device override config option and the statically defined route are needed.
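
Concretely, each container needs the per-NIC override plus the static on-link route inside its Netplan file, for example (assuming the routed NIC is now called eth0, as in the file above):

# lxc config device override c1 eth0 ipv4.address=10.0.0.11

together with the routes: entry via 169.254.0.1 shown above.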

Please can you show ip a and ip r on the host?

Not sure if this is still relevant (maybe to get some advice on the OVN setup?):

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 96:00:02:65:66:9f brd ff:ff:ff:ff:ff:ff
    inet MY.PUB.LIC.IP/32 metric 100 scope global dynamic eth0
       valid_lft 84533sec preferred_lft 84533sec
    inet6 2a01:4f8:c17:3738::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::9400:2ff:fe65:669f/64 scope link 
       valid_lft forever preferred_lft forever
3: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP group default qlen 1000
    link/ether 86:00:00:53:0a:c2 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.30/32 brd 10.0.0.30 scope global dynamic enp7s0
       valid_lft 84536sec preferred_lft 84536sec
    inet6 fe80::8400:ff:fe53:ac2/64 scope link 
       valid_lft forever preferred_lft forever
4: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 72:1b:93:b8:05:f7 brd ff:ff:ff:ff:ff:ff
5: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether 7a:c2:5f:be:28:24 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::d0c1:b5ff:fe9d:d7c8/64 scope link 
       valid_lft forever preferred_lft forever
6: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 12:0a:42:9d:10:e6 brd ff:ff:ff:ff:ff:ff
8: veth3f81c772@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 72:4b:bd:85:26:23 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 169.254.0.1/32 scope global veth3f81c772
       valid_lft forever preferred_lft forever
    inet6 fe80::704b:bdff:fe85:2623/64 scope link 
       valid_lft forever preferred_lft forever
12: vethc3d9f275@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 2a:2d:94:5d:66:6a brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet 169.254.0.1/32 scope global vethc3d9f275
       valid_lft forever preferred_lft forever
    inet6 fe80::282d:94ff:fe5d:666a/64 scope link 
       valid_lft forever preferred_lft forever
       
       
# ip r
default via 172.31.1.1 dev eth0 proto dhcp src MY.PUB.LIC.IP metric 100 
10.0.0.0/16 via 10.0.0.1 dev enp7s0 
10.0.0.1 dev enp7s0 scope link 
10.0.0.31 dev veth3f81c772 scope link 
10.0.0.32 dev vethc3d9f275 scope link 
172.31.1.1 dev eth0 proto dhcp scope link src MY.PUB.LIC.IP metric 100 
185.12.64.1 via 172.31.1.1 dev eth0 proto dhcp src MY.PUB.LIC.IP metric 100 
185.12.64.2 via 172.31.1.1 dev eth0 proto dhcp src MY.PUB.LIC.IP metric 100

We have this guide for setting up an LXD cluster with OVN networking, if that helps 🙂 https://documentation.ubuntu.com/lxd/en/latest/howto/network_ovn_setup/#set-up-a-lxd-cluster-on-ovn

Thanks @markylaing. I successfully followed that guide on another system by configuring a bridge.
However, on the system referred to in this post, bridging is not possible.

In that guide we read that “you must specify either an unmanaged bridge interface or an unused physical interface”, so I’m not sure whether there is some other way around this?

Ah, I missed that you cannot create a bridge. No, unfortunately there is no way around this, so I don’t think OVN is possible.

So going back to the routed NIC:

It’s possible that your netplan config needs to include link-local: [ "ipv4", "ipv6" ]: https://netplan.readthedocs.io/en/latest/netplan-yaml/#properties-for-all-device-types
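
For illustration, merged into the container Netplan file shown earlier in the thread, that suggestion would look something like this (a sketch only, reusing the same addresses):

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      link-local: [ "ipv4", "ipv6" ]
      addresses:
        - 10.0.0.11/32
      routes:
        - to: default
          via: 169.254.0.1
          metric: 100
          on-link: true
      nameservers:
        addresses:
          # Hetzner recursive DNS
          - 185.12.64.1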