Struggling with networking

I’m struggling to get MicroCloud initialized; my network setup just isn’t right…

I have 4 identical Ubuntu Server 22.04 systems with dual NICs. NIC eno1 has the static IP address 192.168.1.31, with the gateway pointed at my network, and it has internet connectivity; all is good there.

The second NIC is enp4s0. It has no assigned IP address, but it (along with the other 3 nodes) is connected to a switch that has no other upstream connectivity.

My netplan yaml file is:

network:
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: false
      addresses:
      - 192.168.1.31/24
      nameservers:
        addresses:
        - 192.168.1.1
        search:
        - dnebinger.com
      routes:
      - to: default
        via: 192.168.1.1
    enp4s0:
      dhcp4: false
      dhcp6: false
      link-local: []
      mtu: 1600
  version: 2

When I use the sudo microcloud init command, it only lists the eno1 card, not the enp4s0.

If I pick it anyway (which I don’t want to do), using either its IPv4 or IPv6 address, it then asks to search, at which point it hangs.

Using tcpdump on the other 3 nodes, I can see that the mDNS traffic is going back and forth, but there are no errors reported by avahi-daemon and no clear indication what is wrong.

I’m fairly certain that it is my netplan config, but I haven’t found a good reference for setting it up so that MicroCloud accepts it; what I have is really just cobbled together from things hinted at in other online places.

It would be a huge help if someone had some insight into what I need to fix or configure so that microcloud init can find and use the enp4s0 card correctly!

Do the 4 systems all have the same hostname? If you’re seeing the mDNS traffic go through, then that might be the problem. Each system should have a unique hostname for MicroCloud to pick them up.

As for why it’s not showing enp4s0 for system lookup, that’s because it doesn’t have any addresses assigned. The selected lookup interface needs to have a default subnet to perform the lookup over.

No, they’re named micro1, micro2, micro3, and micro4.

There’s this note in the instructions, though: “Configure the network interface connected to microbr0 to not accept any IP addresses (because MicroCloud requires a network interface that doesn’t have an IP address assigned):”, so I didn’t think anything should be assigned to enp4s0…

That note is about the uplink interface used for the OVN network after configuring MicroCloud; that one should have no addresses assigned: https://canonical-microcloud.readthedocs-hosted.com/en/latest/tutorial/get_started/#create-a-network

The interface you pick at the beginning is the one used for the mDNS lookup of the machines to set up MicroCloud on. That one does need addresses assigned, and all the systems should be reachable over it by multicast.
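
As a rough, untested sketch of that second point, the stanza below gives the lookup interface a static address so that it can appear in the interface list. The interface name enp4s0 comes from this thread; the 10.10.10.31/24 address is an assumption for illustration and would need to differ on each machine. No routes entry is included, because an address on a directly attached subnet already gives the host a route to that subnet.

```yaml
# Sketch only: an addressed interface for MicroCloud's machine lookup.
# The 10.10.10.0/24 subnet is an assumed example; use a unique host
# address on each system (for example .31, .32, .33 and .34).
network:
  version: 2
  renderer: networkd
  ethernets:
    enp4s0:
      dhcp4: false
      addresses:
      - 10.10.10.31/24
```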

Okay, so I think I have a working netplan because microcloud init completed successfully.

This is what I went with:

network:
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: false
      addresses:
      - 192.168.1.31/24
      nameservers:
        addresses:
        - 192.168.1.1
        search:
        - dnebinger.com
      routes:
      - to: default
        via: 192.168.1.1
    enp4s0:
      dhcp4: false
  bridges:
    br0:
      interfaces: [enp4s0]
      mtu: 1600
      addresses:
      - 10.10.10.31/24
      parameters:
        stp: false
        forward-delay: 0
  version: 2

Okay, so the bridge definitely works; the VMs created within the LXC containers can all talk to each other just fine…

However, they have no internet access, which I understand, because the bridge only uses the internal network card, not the external eno1…

I will have to play with the bridge to try and add the internet access… Fun times!

Any ideas or suggestions are more than welcome!

I do not think you need to have the bridge set up in netplan if you want to use MicroOVN. But the interface you give to MicroOVN should be connected to a valid network. In short, it needs to be unconfigured, but if you were to configure it, it should work and have internet access.

In my setup, the interface I gave OVN is a VLAN that is on subnet 10.10.11.0/24, and my router listens on that VLAN with IP 10.10.11.1. So when answering the questions for OVN, I told it to use the gateway IP 10.10.11.1/24, the first IP to use as 10.10.11.51 and the last as 10.10.11.254. That worked for me and instances get internet access when using the default profile.
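
For illustration, a VLAN uplink of that kind could be declared in netplan roughly as follows. This is a sketch under assumptions, not a config taken from this thread: the VLAN ID 11, the name vlan11 and the parent interface eno1 are placeholders. The VLAN interface itself is left without addresses so that it can be handed to OVN.

```yaml
# Sketch only: an unconfigured VLAN interface to offer as the OVN uplink.
# VLAN ID 11, the name vlan11 and the parent eno1 are assumed placeholders.
network:
  version: 2
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: false        # parent NIC; replace with its real settings
  vlans:
    vlan11:
      id: 11
      link: eno1
      dhcp4: false
      dhcp6: false
      accept-ra: false
      link-local: []
# Values described above for the OVN questions:
#   gateway:  10.10.11.1/24
#   first IP: 10.10.11.51
#   last IP:  10.10.11.254
```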

Well, that’s the thing…

If I run microcloud init with this netplan config, it fails to detect and initialize:

network:
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: false
      addresses:
      - 192.168.1.31/24
      nameservers:
        addresses:
        - 192.168.1.1
        search:
        - dnebinger.com
      routes:
      - to: default
        via: 192.168.1.1
    enp4s0:
      dhcp4: false
      mtu: 1600
      addresses:
      - 10.10.10.31/24
      routes:
      - to: 10.10.10.0/24
        via: 10.10.10.31
        metric: 100
  version: 2

However, if I switch to the bridge version, then it finds and initializes correctly.

That’s why I titled the thread “struggling with networking” because configurations that should work don’t seem to, and configurations that shouldn’t be necessary are.

On top of that, the different online recommendations (e.g. changing the MTU size, setting dhcp6 to false, or using link-local: []) just didn’t seem to have a direct impact.

The bridge version was the only route to success so far…

“In short, [the NIC used for MicroOVN] needs to be unconfigured, but if you were to configure it, it should work and have internet access.”

Ah, now that is something that had not been pointed out before. The NIC I’m using goes to a switch joining the 4 systems but without internet access.

And maybe that’s why the bridge configuration works where the direct card does not: the bridge must emulate having the right access for discovery and usage.

Hi @dnebing

I wanted to take some time to explain how MicroCloud networking operates and hopefully correlate that to your setup.

The microcloud init process has two different network-related stages.

The first stage asks you to “Select an address for MicroCloud’s internal traffic”.
This will list all network interfaces that are up and configured with an IP address.
This is used for intra-cluster traffic, including MicroCloud’s own multicast discovery process, as well as OVN and Ceph traffic between cluster members.
At this time it is not possible to specify different interfaces for OVN and Ceph intra-cluster (east-west) traffic, but this is on our roadmap.

This first interface must be multicast compatible (i.e. a real layer 2 network) and be connected to the other MicroCloud cluster members.

The second networking stage asks “Configure distributed networking? (yes/no) [default=yes]:” and then asks you to “Select exactly one network interface from each cluster member:”.

This second stage is asking you to configure which network interface should be used for the so-called Uplink interface, which is used to connect the OVN network(s) created to the external network (north-south traffic), and potentially the internet.

These interfaces must be either:

  • Unused (no IPs configured on them).
  • A bridge interface.

The reason for this is that, in order to connect a physical interface to the OVN virtual router that is created for each OVN network, that interface needs to be connected to an associated OVS bridge on the host. LXD will do this for you, but doing so renders any IPs configured on the interface inactive, so to avoid accidentally disconnecting yourself we don’t allow interfaces with IPs configured to be used.

The exception here is if the interface is a bridge already, because in that case we can connect that bridge to the OVS bridge using a virtual veth connection.

This uplink interface doesn’t have to be the same interface on every cluster member, which is why you can pick a different one for each member, but they must all be connected to the same layer 2 network (which doesn’t necessarily have to be the same one as used for the internal traffic).

The reason for this is that OVN will pick a single cluster member to use as the active chassis when routing traffic to/from the uplink network. Each OVN network will get a virtual router and its virtual external interface will be assigned an IP address on the uplink network from the range provided during the initialisation.

If one of the cluster members goes down, then OVN will move that virtual router’s IP to a different cluster member by way of ARP/NDP adverts. So the uplink interfaces must be on the same layer 2 network.
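
To relate the bridge exception to the setup in this thread, here is an untested sketch in which the host’s LAN address moves from eno1 onto a bridge, so that the bridge could be offered as the uplink while enp4s0 carries the internal traffic. The bridge name br0 and the 10.10.10.31/24 address are assumptions; the remaining values are taken from the configs posted earlier.

```yaml
# Sketch only: eno1 joins a bridge that holds the host's LAN address,
# so br0 (a bridge) may be selected as the uplink interface.
network:
  version: 2
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: false
    enp4s0:
      dhcp4: false
      addresses:
      - 10.10.10.31/24      # assumed address for internal cluster traffic
  bridges:
    br0:
      interfaces: [eno1]
      addresses:
      - 192.168.1.31/24
      nameservers:
        addresses:
        - 192.168.1.1
      routes:
      - to: default
        via: 192.168.1.1
      parameters:
        stp: false
        forward-delay: 0
```

With a layout like this, the IP range given for the OVN uplink during initialisation would need to come from unused addresses on the uplink network (192.168.1.0/24 here).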

So hopefully that helps you understand what is going on under the hood here.

@maria-seralessandri @masnax would it be possible to improve the microcloud init process to make it clearer what the selected interface in the OVN setup phase is going to be used for? Currently it doesn’t say why the user is being asked to pick an interface, nor is it clear that this interface must be unused or a bridge.

Wow, there’s a lot of detail there, I greatly appreciate that you’ve shared it.

Whether I understand it or not, well that will be more of a question of time to digest and sort it out.

It does suggest to me, though, that perhaps MicroCloud is not intended for someone who is not as versed in the networking details as presented here.

Or, if it is, that perhaps more or better documentation on initial network setup (e.g. sample netplan configs for single-NIC systems, multi-NIC systems, etc.) could help fill in the gaps. That was certainly where I was struggling, trying to find the right netplan config for microcloud init to complete like it did in the documentation…

You’re welcome :)

MicroCloud is primarily intended to be a way to deploy a highly available private cloud using LXD, Ceph and OVN in an opinionated way.

Initially this means that each member requires:

  • 2x NICs - one configured, which will be used for internal cluster traffic, and one unconfigured (or a single shared bridge) that can be used for OVN network external connectivity.
  • 2x empty hard drives - one for local ZFS storage, one for remote Ceph storage.

These opinionated requirements will become more flexible in time, though.
It is already possible, for instance, to choose not to configure local storage during microcloud init and then later configure LXD directly with a storage pool driver of your choice, which means you can then choose to use an existing partition or directory.

We do have plans to allow support for partitions in microcloud init also, but there are some technical issues we need to resolve first.

We also have plans to add support for specifying different configured network interfaces for the Ceph and OVN cluster traffic so that those can be separated if needed (for performance or security reasons).

I agree the documentation could do with being expanded and the overall architecture explained more clearly. We have an item this cycle to add network diagrams which should help here. I agree adding example netplan configs would be useful too.

Following on from this thread, we are making a change to the microcloud init questions to clarify what the 2nd (unused) network interface will be used for, as I don’t feel that is clear at the moment:

https://github.com/canonical/microcloud/pull/234/files

Opinionated I’m fine with… I mean, I know you go a direction like this because you want a solution but you don’t want to have to become an expert in LXD, Ceph and OVN to get there.

I was not aware of the 2x NIC and 2x empty drive requirements; I don’t think that’s spelled out anywhere in the MicroCloud documentation. I got help with the partition issue from another community member, so that was the easiest part to work around, but the networking side definitely threw me for a loop.

That said, I’m excited to see where this goes…
