LXD unix socket not accessible - zfs missing even though I am using btrfs

I am running LXD 5.21.1 LTS via snap on top of Ubuntu 20.04 and can no longer access anything.

symptom:

sysop@ubn2004LXD:~$ lxc list
Error: LXD unix socket not accessible: Get "http://unix.socket/1.0": EOF

checking the daemon

sysop@ubn2004LXD:~$ sudo systemctl status snap.lxd.daemon
...
...
Jun 20 23:12:41 ubn2004LXD lxd.daemon[24852]: time="2024-06-20T23:12:41+02:00" level=warning msg=" - Couldn't find the CGroup blkio.weight, disk priority will be ignored"
Jun 20 23:12:41 ubn2004LXD lxd.daemon[24852]: time="2024-06-20T23:12:41+02:00" level=warning msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
Jun 20 23:12:41 ubn2004LXD lxd.daemon[24852]: time="2024-06-20T23:12:41+02:00" level=warning msg="Instance type not operational" driver=qemu err="KVM support is missing (no /dev/kvm)" type=>
Jun 20 23:12:43 ubn2004LXD lxd.daemon[24852]: time="2024-06-20T23:12:43+02:00" level=error msg="Failed loading storage pool" err="Required tool 'zpool' is missing" pool=pool-52G
Jun 20 23:12:43 ubn2004LXD lxd.daemon[24852]: time="2024-06-20T23:12:43+02:00" level=error msg="Failed to start the daemon" err="Failed applying patch \"storage_move_custom_iso_block_volume>

Now, the zpool error led me to:
LXD 5.12: ZFS stopped working in lxd - Error: Required tool 'zpool' is missing when kernel ZFS module version < 0.8:
However, I am not using zfs but btrfs, so the mentioned pool-52G is a btrfs-backed pool.

what worries me a bit is:

sysop@ubn2004LXD:~$ sudo ls -l /var/snap/lxd/common/lxd/storage-pools/pool-52G/containers/erpnext4
total 0

or

sysop@ubn2004LXD:~$ sudo du -h -d 1 /var/snap/lxd/common/lxd/storage-pools/pool-52G/containers
4.0K	/var/snap/lxd/common/lxd/storage-pools/pool-52G/containers/ubn2204
4.0K	/var/snap/lxd/common/lxd/storage-pools/pool-52G/containers/erpnext4
12K	/var/snap/lxd/common/lxd/storage-pools/pool-52G/containers
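
To rule out a simple mount problem (just a sketch on my side, assuming the default snap paths), something I also want to check is whether anything is actually mounted at the pool path, since an empty directory there may just mean the btrfs filesystem is not mounted at that location (the snap can also keep the mounts in its own namespace):

sysop@ubn2004LXD:~$ sudo findmnt -T /var/snap/lxd/common/lxd/storage-pools/pool-52G
sysop@ubn2004LXD:~$ sudo mount | grep pool-52G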

Even after installing ZFS and running sudo apt-get install --install-recommends linux-generic-hwe-20.04, the symptom LXD unix socket not accessible: Get "http://unix.socket/1.0": EOF stays, but the error in the systemctl status output changes to:

Jun 22 20:11:23 ubn2004LXD lxd.daemon[65683]: => Starting LXD
Jun 22 20:11:23 ubn2004LXD lxd.daemon[65821]: time="2024-06-22T20:11:23+02:00" level=warning msg=" - Couldn't find the CGroup blkio.weight, disk priority will be ignored"
Jun 22 20:11:23 ubn2004LXD lxd.daemon[65821]: time="2024-06-22T20:11:23+02:00" level=warning msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
Jun 22 20:11:23 ubn2004LXD lxd.daemon[65821]: time="2024-06-22T20:11:23+02:00" level=warning msg="Instance type not operational" driver=qemu err="KVM support is missing (no /dev/kvm)" type=virtual-machine
Jun 22 20:11:24 ubn2004LXD lxd.daemon[65821]: time="2024-06-22T20:11:24+02:00" level=error msg="Failed loading storage pool" err="Required tool 'zpool' is missing" pool=pool-52G
Jun 22 20:11:24 ubn2004LXD lxd.daemon[65821]: time="2024-06-22T20:11:24+02:00" level=error msg="Failed to start the daemon" err="Failed applying patch \"storage_move_custom_iso_block_volumes_v2\": Failed loading pool \"pool-52G\": Required tool 'zpool' is missing"

That pool pool-52G is a btrfs pool to begin with.

The most worrying fact is that /var/snap/lxd/common/lxd/storage-pools/pool-52G/containers/erpnext4 (erpnext4 being my production container) does not contain anything.

So, any idea what I can do now?

It’s indeed weird that your pool-52G would be misidentified. Could you please double check by running blkid on the backing files you’ll find in /var/snap/lxd/common/lxd/disks?

Here’s what I have locally for a zfs and a btrfs storage pool:

$ sudo blkid /var/snap/lxd/common/lxd/disks/juju-zfs.img
/var/snap/lxd/common/lxd/disks/juju-zfs.img: LABEL="juju-zfs" UUID="10257962908284069211" UUID_SUB="9214883900141444860" BLOCK_SIZE="512" TYPE="zfs_member"
$ sudo blkid /var/snap/lxd/common/lxd/disks/juju-btrfs.img
/var/snap/lxd/common/lxd/disks/juju-btrfs.img: LABEL="juju-btrfs" UUID="c869b9e0-8aa3-4247-93a8-d77624f854af" UUID_SUB="2a38ab00-2eed-4a67-8c2c-c34618a0ab0f" BLOCK_SIZE="4096" TYPE="btrfs"

Thanks for the feedback. However, what I am getting does not give me much hope that I’ll be able to recover.

sysop@ubn2004LXD:~$ sudo ls -l /var/snap/lxd/common/lxd/disks
total 0

I picked up some comments here and there implying that btrfs is a deathtrap of some sort, at least with regard to LXD. Looks like I might have fallen into it.

And actually, on my laptop I set up an Incus instance a while ago where something very similar (a disappearing storage pool) happened. I did not spend much time looking into it, I am afraid, so maybe this is just my own costly learning experience.

That’s possibly because you never used loop devices. You can check for other block devices/partitions that are identified as ZFS or btrfs using this command:

$ sudo blkid | grep -E '"(btrfs|zfs_member)"'

It’s true that btrfs has some caveats but it should still provide you with a reliable experience.
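
One caveat: blkid reports a single signature per device, so if something looks misidentified, listing all the signatures present on it can help, since stale leftovers from an earlier filesystem can confuse detection. A sketch, with /dev/sdXN standing in for the device in question:

$ sudo wipefs --no-act /dev/sdXN

wipefs in --no-act mode only prints what it finds and does not erase anything.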

hm, that is really unexpected

$ blkid | grep -E '"(btrfs|zfs_member)"'
/dev/sdc1: LABEL="pool-52G" UUID="11910157946427829579" UUID_SUB="9603014261412617731" TYPE="zfs_member" PARTLABEL="zfs-31ecaa2e2cf2124c" PARTUUID="5301f68f-282b-ac4b-a527-b1b60e6d29f0"
/dev/mapper/vg0-lxdPool: LABEL="default" UUID="d9320138-3142-4e44-93bd-a0e9ccbfef4a" UUID_SUB="e0ef6ad3-5cfe-453f-8699-389a00021da2" TYPE="btrfs"

ZFS was only installed just now, while trying to get up and running again; it had never been installed on this machine before. So the mentioned /dev/sdc1 cannot really be part of a zfs pool.

Or am I misinterpreting something here?
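
In case it helps, one more thing I could check (a sketch, assuming btrfs-progs is installed) is which labels and member devices the btrfs side reports, to compare against the blkid output above:

$ sudo btrfs filesystem show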

If LXD is now running, please can you show the output of:

lxc storage show pool-52G 

unfortunately not

$ lxc storage show pool-52G
Error: LXD unix socket not accessible: Get "http://unix.socket/1.0": EOF

Let's have a look at the last state of your LXD database's storage pool tables so we can get some clarity about your specific setup.

Please can you run:

sudo snap stop lxd
sudo sqlite3 /var/snap/lxd/common/lxd/database/global/db.bin
.headers on
select * from storage_pools;
select * from storage_pools_config;
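
Once you have that output, you can exit the SQLite shell and start LXD again; the queries above are read-only:

.quit
sudo snap start lxd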