LXD Cluster - separate network address confusion

jbrandt · July 23, 2023, 4:01am

I am confused by the LXD docs that refer to managing a cluster wherein separate cluster and core https addresses are used.

I have a 5-member LXD cluster running on five physical hosts with Debian 11 and snap 2.59.5 and lxd 5.15. Each host machine has multiple NICs - two of which I have configured for LXD use as bridges on each machine.

Since one NIC runs at 20 Mbit and the other at 1 Mbit, I would prefer to use the higher speed NIC for cluster comms and the lower speed one for API access. Is this the right way to think about it? The higher speed link being used for tasks that require it - like database sync, migrations, moving volumes, etc?

Currently, all of my cluster members have addresses on the 10.0.100.x/24 subnet and those addresses (like 10.0.100.20:8443) are used for both core.https_address and cluster.https_address. That subnet is linked to the high speed NIC.

I have another subnet that the 1 Mbit NICs are on, 10.0.99.x/24, and all five hosts have that network defined and can ping each other’s 99.x addresses. When I tried changing one member’s core.https_address to 10.0.99.50 instead of 10.0.100.50 - while leaving its cluster address unchanged - it fell out of the cluster and showed up in lxc cluster list as OFFLINE.

The docs say “You can configure different networks for the REST API endpoint of your clients and for internal traffic between the members of your cluster.” The note on that page says, “core.https_address is specific to the cluster member, so you can use different addresses on different members.”

So, for now, all of my cluster members have the same IP addresses defined for both core and cluster. How do you separate them without breaking the cluster?

– Update
Related to my initial question, can someone tell me which https address type (core or cluster) is set when initially joining a cluster via lxd init?

During the init process, LXD asks “What IP address or DNS name should be used to reach this server?” I presume that LXD sets either the core or the cluster https address (or both) to the IP address entered here.

jbrandt · July 23, 2023, 10:39am

Well, I think I found an answer buried in the docs under Recover a Cluster.

The subsection on that page entitled “Recover cluster members with changed addresses” shows how to use the sudo lxd cluster edit command to manually change addresses and roles after stopping lxd on each node.

I may give this a try. Unfortunately, I’ve already reverted my cluster back to five individual non-clustered machines. I will need to decide if I want to run these as five separate LXD servers or if the benefits of clustering outweigh the complexity. I’m about to configure Ceph, so we’ll see if using a shared file system will tip the decision one way or the other.

tomp · July 24, 2023, 7:10am

Hi

core.https_address is used for the API (and can be a wildcard address) and cluster.https_address is used for intra-cluster communication (and must be a single IP).
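As a rough sketch of what that split looks like on a single member, the two keys can be read back and the API address changed with `lxc config`. The function names and the `LXC` variable are only illustrative helpers (not part of LXD), and the 10.0.99.50:8443 address in the usage note is just the one from this thread. The sketch assumes the cluster address was already set to the intended address when the member joined.

```shell
# LXC names the client binary (hypothetical helper variable, not an LXD setting).
# Set LXC=echo to print the commands instead of running them.
LXC="${LXC:-lxc}"

# Show both addresses currently configured on this member.
show_addresses() {
    "$LXC" config get core.https_address
    "$LXC" config get cluster.https_address
}

# Point the REST API at a different address.
# cluster.https_address is deliberately left untouched.
set_api_address() {
    "$LXC" config set core.https_address "$1"
}
```

For example, `set_api_address 10.0.99.50:8443` would move the API onto the slower subnet while intra-cluster traffic stays on the cluster address, and `set_api_address 0.0.0.0:8443` would use a wildcard. Running `show_addresses` before and after makes it easy to confirm that only the core address changed.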

jbrandt · July 24, 2023, 10:20am

Thanks, Tom

After some clean up, I was able to run lxd init again on all nodes. After all of the members joined the cluster, I then changed core.https_address on each of them.

Just to make sure it stuck, I shut down the lxd daemon on each node and performed an lxd cluster edit to ensure all nodes had the right cluster addresses. After restarting, all is working well.
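For anyone following along, the stop / edit / restart sequence can be sketched roughly as below. It assumes the snap package of LXD (as used in this thread), so the daemon is stopped and started through `snap`. The function names and the `RUN` variable are illustrative only, and `lxd cluster edit` opens an editor, so the step is interactive.

```shell
# RUN is the prefix used to execute privileged commands (hypothetical helper).
# Set RUN=echo to print the commands instead of running them.
RUN="${RUN:-sudo}"

# Stop the LXD daemon on this node (snap-packaged LXD assumed).
stop_lxd() {
    "$RUN" snap stop lxd
}

# Open the cluster member list in an editor so addresses and roles
# can be checked or corrected by hand. The daemon must be stopped first.
edit_cluster_members() {
    "$RUN" lxd cluster edit
}

# Start the LXD daemon on this node again.
start_lxd() {
    "$RUN" snap start lxd
}
```

The order described above would be `stop_lxd` on every node, then `edit_cluster_members`, then `start_lxd` on every node, followed by `lxc cluster list` to check that all members come back online.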