LXD Cluster with TrueNAS Storage Pools

My team and I are designing an LXD Cluster.

We are running Ubuntu 24.04 LTS on Dell R640 servers with 1.5 TB of RAM and 2x 2TB SATA SSDs in a RAID 1 (mirror) for the OS. Connectivity is SFP28 25Gb fiber, using Broadcom fiber cards.

Network works great and is blazing fast.

The objective was to use the TrueNAS device as storage for the LXD cluster’s containers, similar to a “Datastore” in VMware.

We have set up the IOs on the TrueNAS. We have set up the NFS share, and it is mounted on each of the Ubuntu cluster nodes (bare-metal machines running Ubuntu and LXD). A storage pool of the ‘dir’ type can be created in LXD on top of the NFS share, but it cannot be used.
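
For reference, a dir pool in a cluster is created in two steps; we did something along these lines (the pool name and mount path are placeholders for our actual values):

    # Stage the pool on every cluster member, pointing at the local NFS mount:
    lxc storage create nfspool dir source=/mnt/truenas/lxd --target node1
    lxc storage create nfspool dir source=/mnt/truenas/lxd --target node2
    lxc storage create nfspool dir source=/mnt/truenas/lxd --target node3
    # Then run the final create to commit the pool cluster-wide:
    lxc storage create nfspool dir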

Have tested on two identical machines with the same outcome.

Any thoughts on what is being missed here?

I can’t give you a solution to your current problem but I just wanted to say that running VMs and LXCs over NFS will give you terrible performance. The dir storage driver is also very limited.

In the world of qemu/kvm, Ceph is the default choice for a remote datastore. Basically every KVM-based hypervisor supports Ceph. This will also allow you to more easily migrate between other KVM-based hypervisors. Our migration from OpenNebula to LXD and Proxmox was super easy because all 3 are KVM-based and support Ceph.

So my suggestion for you would be to consider Ceph if you want remote, highly available storage. That will perform far better than anything NFS-based, and it will also be HA.
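
For a sense of the LXD side of it: once the Ceph cluster itself is running, pointing LXD at it is a single pool definition. A sketch, where the OSD pool name and Ceph user are placeholders:

    # Create an LXD storage pool backed by Ceph RBD:
    lxc storage create remote ceph ceph.osd.pool_name=lxd ceph.user.name=admin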

The downside is that it means taking on a new and somewhat complex storage system to manage, but if you plan on filling that 1.5TB of memory with LXCs and VMs, then local storage or Ceph are your only viable options.

If you plan on sticking to one host, local storage (ZFS) will be the clear winner in terms of cost, complexity and performance.
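
For comparison, a local ZFS pool is a one-liner per host; a sketch, with the device path and size as placeholders:

    # Back the pool with a dedicated, empty block device:
    lxc storage create local zfs source=/dev/sdb
    # Or, without a spare device, a fixed-size loop-backed pool:
    lxc storage create local zfs size=500GiB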


Thank you for your remarks @vosdev

The purpose of this project is to convert away from a small VMware vSphere cluster to a small Ubuntu LXD cluster. The current vSphere cluster contains three metal servers, each with 8x 2TB SATA SSD drives. Each metal machine has its own local storage, but through vSphere any VM can be moved off of any metal machine to any other metal machine in a matter of minutes. vSphere also has a feature that allows moving a live VM to a different metal server, unnoticed by users except for a small performance decrease.

I want to replicate this functionality using Ubuntu and LXD. However, I’m not prepared to use Proxmox and Ceph.

Thoughts?

The one downside to using local storage is that you cannot live-migrate instances between hosts, afaik. You can do offline migration though: LXD will stop the VM, copy the data over the network, then start it on the new host.
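
A sketch of that offline move between cluster members (instance and member names are placeholders):

    lxc stop vm1
    lxc move vm1 --target node2
    lxc start vm1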

No need to use Proxmox; we actually migrated away from Proxmox to LXD because of the lack of features and support.

My recommendation is still to set up Ceph. Ceph nowadays is relatively easy to set up and manage with cephadm. It’s completely self-managing and self-healing.

Run ceph orch upgrade start --ceph-version x.y.z and you’ll move to a new version without downtime.
Or you can pull a broken disk, add a new one, and it will automatically absorb the new disk into the storage cluster. No configuration required.
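
For example, the pull-a-disk/add-a-disk behaviour comes down to a one-time OSD spec plus a removal command; roughly (the OSD id is a placeholder):

    # Let the orchestrator turn any new, empty disk into an OSD automatically:
    ceph orch apply osd --all-available-devices
    # Drain and remove a failed OSD; the data rebalances on its own:
    ceph orch osd rm 3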

I would not have recommended Ceph at version 14 and older to anyone who had never used it, because it was very complex and required a LOT of manual maintenance. Since version 15 the new orchestrator automates most of the tasks for you, and the latest release is version 19. With a cluster of 6 disks you won’t need to do any fine-tuning.

Give it a try: create 3 VMs with the latest Ubuntu 24.04, give each of them 2 additional 10GB or 50GB disks, and see for yourself :). You could even install LXD as well, to test configuring LXD together with Ceph.
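
A sketch of such a test lab, assuming the existing storage pool is called default (names and sizes are just examples):

    # One of the three Ubuntu 24.04 VMs that will act as Ceph nodes:
    lxc launch ubuntu:24.04 ceph1 --vm -c limits.cpu=2 -c limits.memory=4GiB
    # Two extra 10GiB block volumes to serve as its OSD disks:
    lxc storage volume create default ceph1-osd1 --type=block size=10GiB
    lxc storage volume create default ceph1-osd2 --type=block size=10GiB
    lxc storage volume attach default ceph1-osd1 ceph1
    lxc storage volume attach default ceph1-osd2 ceph1
    # Repeat for ceph2 and ceph3, then bootstrap Ceph inside the VMs.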

The benefits are big: you will have 3 copies of your VM storage. If node1 fails because of a power issue, your VMs can immediately start on node2 or node3 because the data is already there. The redundancy is far higher than local RAID alone, and downtime is minimal.
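
That replica count is just the RBD pool’s size setting (the pool name here is a placeholder):

    # Ceph keeps 3 copies per pool by default:
    ceph osd pool get lxd size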

You can mark a host down for maintenance and LXD will migrate your VMs away in seconds. You can then perform maintenance on the node, and afterwards all the VMs come back to it.
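
In LXD that maintenance flow is the evacuate/restore pair (the member name is a placeholder):

    lxc cluster evacuate node1   # migrate or stop this member's instances
    # ...do the maintenance, reboot, etc...
    lxc cluster restore node1    # bring the instances back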

The only data that has to move between nodes is the VMs’ memory, and with 25Gbps adapters that will empty your host in less than a minute (for example, 1 Tbit of memory in use ≈ 40 seconds).

Canonical also provides Ceph as a snap (MicroCeph). If you’re deploying your cluster fresh, I recommend deploying with MicroCloud as it will automate the Ceph and OVN setup. It works well for my homelab.
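
A sketch of that fresh deployment path, run on every node and then initialised from one of them:

    sudo snap install lxd microceph microovn microcloud
    sudo microcloud init   # interactive: discovers peers, sets up Ceph and OVN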

I would note that because of Ceph’s replicas you’ll get higher write latency. LXD supports using both local and Ceph storage for different machines/volumes though, so you don’t necessarily need to make that trade-off for the whole cluster, just for individual machines/volumes.
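
Choosing per instance is just the storage flag at launch time; a sketch with placeholder pool and instance names:

    # Latency-sensitive instance on the local pool:
    lxc launch ubuntu:24.04 db1 --vm --storage local
    # HA-critical instance on the Ceph-backed pool:
    lxc launch ubuntu:24.04 gateway1 --vm --storage remote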

Yes, I do this too. My clustered/redundant VMs and LXCs run off local ZFS storage, while the not-so-redundant but mission-critical VMs and LXCs are stored on Ceph.

No matter how luxurious your setup, Ceph will always be slower than local storage because it is network storage.