MicroCloud init: failed to bootstrap OSD

Hello,

I’m getting started with MicroCloud. As an initial environment, I have set up a VM for a single-node MicroCloud with the following commands:

lxc launch ubuntu:24.04 testmicrocloud --vm -c limits.cpu=4 -c limits.memory=8GiB
lxc storage volume create lxdpool01 testmicrocloud-vol01 --type block size=10GiB
lxc storage volume create lxdpool01 testmicrocloud-vol02 --type block size=10GiB
lxc storage volume attach lxdpool01 testmicrocloud-vol01 testmicrocloud
lxc storage volume attach lxdpool01 testmicrocloud-vol02 testmicrocloud
lxc network attach lxdbr1 testmicrocloud # lxdbr0 is attached already

I then followed the installation instructions inside the VM: https://canonical-microcloud.readthedocs-hosted.com/en/latest/microcloud/how-to/install/#installation

Then inside the VM:

root@testmicrocloud:~# lsblk
NAME     MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
loop0      7:0    0  73.9M  1 loop  /snap/core22/1802
loop1      7:1    0  66.2M  1 loop  /snap/core24/739
loop2      7:2    0 114.4M  1 loop  /snap/lxd/33110
loop3      7:3    0 111.5M  1 loop  /snap/microceph/1293
loop4      7:4    0  21.1M  1 loop  /snap/microovn/667
loop5      7:5    0  10.4M  1 loop  /snap/microcloud/1144
loop6      7:6    0  44.4M  1 loop  /snap/snapd/23771
sda        8:0    0    10G  0 disk  
├─sda1     8:1    0     9G  0 part  /
├─sda14    8:14   0     4M  0 part  
├─sda15    8:15   0   106M  0 part  /boot/efi
└─sda16  259:0    0   913M  0 part  /boot
sdb        8:16   0    10G  0 disk  
└─mpatha 252:0    0    10G  0 mpath 
sdc        8:32   0    10G  0 disk  
└─mpatha 252:0    0    10G  0 mpath 
root@testmicrocloud:~# microcloud init
Waiting for services to start ...
Do you want to set up more than one cluster member? (yes/no) [default=yes]: no
Using address "10.0.0.2" for MicroCloud
Gathering system information ...
Would you like to set up local storage? (yes/no) [default=yes]: no
Would you like to set up distributed storage? (yes/no) [default=yes]: yes
Select from the available unpartitioned disks:

Select which disks to wipe:

Disk configuration does not meet recommendations for fault tolerance. At least 3 systems must supply disks.
Continuing with this configuration will inhibit MicroCloud's ability to retain data on system failure
Change disk selection? (yes/no) [default=yes]: no

 Using 1 disk(s) on "testmicrocloud" for remote storage pool

Do you want to encrypt the selected disks? (yes/no) [default=no]: no 
Would you like to set up CephFS remote storage? (yes/no) [default=yes]: yes
What subnet (either IPv4 or IPv6 CIDR notation) would you like your Ceph internal traffic on? [default: 10.0.0.0/24] ^C
root@testmicrocloud:~# microcloud init
Waiting for services to start ...
Do you want to set up more than one cluster member? (yes/no) [default=yes]: no
Using address "10.0.0.2" for MicroCloud
Gathering system information ...
Would you like to set up local storage? (yes/no) [default=yes]: no
Would you like to set up distributed storage? (yes/no) [default=yes]: yes
Select from the available unpartitioned disks:

Select which disks to wipe:

Disk configuration does not meet recommendations for fault tolerance. At least 3 systems must supply disks.
Continuing with this configuration will inhibit MicroCloud's ability to retain data on system failure
Change disk selection? (yes/no) [default=yes]: no

 Using 1 disk(s) on "testmicrocloud" for remote storage pool

Do you want to encrypt the selected disks? (yes/no) [default=no]: no
Would you like to set up CephFS remote storage? (yes/no) [default=yes]: no
What subnet (either IPv4 or IPv6 CIDR notation) would you like your Ceph internal traffic on? [default: 10.0.0.0/24] 10.0.0.0/24
What subnet (either IPv4 or IPv6 CIDR notation) would you like your Ceph public traffic on? [default: 10.0.0.0/24] 10.0.0.0/24
Configure distributed networking? (yes/no) [default=yes]: no
Initializing new services
 Local MicroCloud is ready
 Local MicroOVN is ready
 Local MicroCeph is ready
 Local LXD is ready
Awaiting cluster formation ...
Error: Failed to add disk to MicroCeph: failed to bootstrap OSD: Failed to run: ceph-osd --mkfs --no-mon-config -i 1: exit status 250 (2025-04-10T13:49:31.685+0000 79af3911b600 -1 bluestore(/var/lib/ceph/osd/ceph-1/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-1/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
2025-04-10T13:49:31.685+0000 79af3911b600 -1 bluestore(/var/lib/ceph/osd/ceph-1/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-1/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]                                                                                                                                       
2025-04-10T13:49:31.685+0000 79af3911b600 -1 bluestore(/var/lib/ceph/osd/ceph-1/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-1/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]                                                                                                                                       
2025-04-10T13:49:31.728+0000 79af3911b600 -1 bdev(0x561d313b4000 /var/lib/ceph/osd/ceph-1/block) open open got: (16) Device or resource busy                                                                                                                                                                                                                                                                                                                        
2025-04-10T13:49:31.728+0000 79af3911b600 -1 bluestore(/var/lib/ceph/osd/ceph-1) mkfs failed, (16) Device or resource busy                                                                                                                                                                                                                                                                                                                                          
2025-04-10T13:49:31.728+0000 79af3911b600 -1 OSD::mkfs: ObjectStore::mkfs failed with error (16) Device or resource busy                                                                                                                                                                                                                                                                                                                                            
2025-04-10T13:49:31.728+0000 79af3911b600 -1  ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-1: (16) Device or resource busy)

I’ve double-checked the requirements and also tried answering yes to the local storage question. I’m not sure what else I can do to debug this.

Thanks!
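
The repeated "(16) Device or resource busy" lines in the log above suggest that something else had the block device open when ceph-osd tried to create the object store. One way to check, sketched here under assumptions, is to look at the kernel's holders directory for the disk in sysfs, which lists devices stacked on top of it (for example a device-mapper map). The function name list_holders and the SYSFS_ROOT override are hypothetical helpers for illustration and are not part of MicroCloud or MicroCeph.

```shell
#!/bin/sh
# List the kernel "holders" of a block device, i.e. devices stacked on top
# of it (device-mapper maps such as multipath, LVM, or dm-crypt).
# SYSFS_ROOT is a hypothetical override so the function can be tried
# against a fake directory tree instead of the real /sys.
list_holders() {
    dev="$1"
    root="${SYSFS_ROOT:-/sys}"
    for h in "$root/block/$dev/holders"/*; do
        # The glob stays unexpanded when the directory is empty or missing.
        [ -e "$h" ] || continue
        basename "$h"
    done
}

# Usage inside the VM:
#   list_holders sdb
```

If this prints a name such as dm-0, a device-mapper device is layered on the disk, which could explain the busy error. No output means nothing is stacked on the disk at that level.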

What does snap list show inside your LXD VMs?

root@testmicrocloud:~# snap list
Name        Version                 Rev    Tracking       Publisher   Notes
core22      20250210                1802   latest/stable  canonical✓  base
core24      20241217                739    latest/stable  canonical✓  base
lxd         5.21.3-c5ae129          33110  5.21/stable    canonical✓  in-cohort
microceph   19.2.0+snapa0ec6e1371   1293   squid/stable   canonical✓  in-cohort
microcloud  2.1.0-3e8b183           1144   2/stable       canonical✓  in-cohort
microovn    24.03.2+snapa2c59c105b  667    24.03/stable   canonical✓  in-cohort
snapd       2.67.1                  23771  latest/stable  canonical✓  snapd

Hi, is this happening consistently? Is something else using the disks? You can try uninstalling the snaps and running the init process again.

No, the disks are dedicated to this machine. I tried detaching and deleting them, then creating and attaching a new disk, but no luck. I find the mpatha line in the lsblk output a little puzzling: it doesn’t show up when I attach the volumes to other VMs that don’t have MicroCloud installed.

How did you attach those disks to the VMs?

I put the code snippet in the post:

lxc storage volume attach lxdpool01 testmicrocloud-vol02 testmicrocloud

Thanks, what storage driver are you using on the host for lxdpool01?

Those mpatha entries indeed look suspicious. Have you applied any other configuration than what is shown in the tutorial?

Thanks @jpelizaeus for the support.

The host is using ZFS as the storage driver.

I was able to initialize MicroCloud when the VM was assigned just one block device.

However, have a look at the following command sequence:

pmarini@rcasys-laptop:~$ lxc launch ubuntu:24.04 testmicrocloud --vm -c limits.cpu=4 -c limits.memory=8GiB
Launching testmicrocloud

pmarini@rcasys-laptop:~$ lxc storage volume create lxdpool01 testmicrocloud-vol01 size=10GiB --type block
Storage volume testmicrocloud-vol01 created

pmarini@rcasys-laptop:~$ lxc storage volume attach lxdpool01 testmicrocloud-vol01 testmicrocloud

pmarini@rcasys-laptop:~$ lxc exec testmicrocloud lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda       8:0    0   10G  0 disk 
├─sda1    8:1    0    9G  0 part /
├─sda14   8:14   0    4M  0 part 
├─sda15   8:15   0  106M  0 part /boot/efi
└─sda16 259:0    0  913M  0 part /boot
sdb       8:16   0   10G  0 disk 

pmarini@rcasys-laptop:~$ lxc storage volume create lxdpool01 testmicrocloud-vol02 size=10GiB --type block
Storage volume testmicrocloud-vol02 created

pmarini@rcasys-laptop:~$ lxc storage volume attach lxdpool01 testmicrocloud-vol02 testmicrocloud

pmarini@rcasys-laptop:~$ lxc exec testmicrocloud lsblk
NAME     MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sda        8:0    0   10G  0 disk  
├─sda1     8:1    0    9G  0 part  /
├─sda14    8:14   0    4M  0 part  
├─sda15    8:15   0  106M  0 part  /boot/efi
└─sda16  259:0    0  913M  0 part  /boot
sdb        8:16   0   10G  0 disk  
└─mpatha 252:0    0   10G  0 mpath 
sdc        8:32   0   10G  0 disk  
└─mpatha 252:0    0   10G  0 mpath 

It seems that mpatha is created after attaching a second volume. Assuming this is unexpected, I didn’t go on with the installation and initialization of MicroCloud. Looking forward to your feedback.

Can you please also post the version of LXD used on the host?
I am failing to reproduce this using LXD latest/edge on my host machine.

$ snap list lxd
Name  Version      Rev    Tracking       Publisher   Notes
lxd   6.3-f22a395  33085  latest/stable  canonical✓  -

I can try to reproduce the issue on another LXD host I run, with LXD 5.21.

Reproduced on another host with 5.21.3-c5ae129

Mh, what base operating system do you use on your host?

Ubuntu Desktop 24.04 in the first case and Ubuntu Server 24.04 in the second. The kernel is 6.8.0-57-generic in both cases.

Do you have something configured inside /etc/multipath.conf on the testmicrocloud instance? I am trying to understand where this multipath stuff is coming from.

Yes:

root@testmicrocloud:~# cat /etc/multipath.conf 
defaults {
    user_friendly_names yes
}

I didn’t explicitly create the file.

Mh, that seems normal. Is there anything else under /etc/multipath/, or anything inside the files there?
By the way, which version of the ubuntu:24.04 image are you using? You can get the one used by the instance with lxc config show testmicrocloud | grep volatile.base_image and view the image details with lxc image show <id>.

root@testmicrocloud:~# tree /etc/multipath
/etc/multipath
├── bindings
└── wwids

1 directory, 2 files
root@testmicrocloud:~# cat /etc/multipath/bindings 
# Multipath bindings, Version : 1.0
# NOTE: this file is automatically maintained by the multipath program.
# You should not need to edit this file in normal circumstances.
#
# Format:
# alias wwid
#
mpatha 0QEMU_QEMU_HARDDISK_lxd_testmicrocloud--
root@testmicrocloud:~# cat /etc/multipath/wwids 
# Multipath wwids, Version : 1.0
# NOTE: This file is automatically maintained by multipath and multipathd.
# You should not need to edit this file in normal circumstances.
#
# Valid WWIDs:
/0QEMU_QEMU_HARDDISK_lxd_testmicrocloud--/
root@testmicrocloud:~# lsblk
NAME     MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sda        8:0    0   10G  0 disk  
├─sda1     8:1    0    9G  0 part  /
├─sda14    8:14   0    4M  0 part  
├─sda15    8:15   0  106M  0 part  /boot/efi
└─sda16  259:0    0  913M  0 part  /boot
sdb        8:16   0   10G  0 disk  
└─mpatha 252:0    0   10G  0 mpath 
sdc        8:32   0   10G  0 disk  
└─mpatha 252:0    0   10G  0 mpath
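
The bindings and wwids files above contain a single WWID, and lsblk shows the same mpatha map under both sdb and sdc. That pattern would be consistent with both volumes reporting the same identifier inside the VM, so that multipathd treats them as two paths to one disk. The thread does not confirm this, so treat it as an assumption. A minimal sketch for checking it: feed the output of lsblk -dn -o NAME,SERIAL into a small filter that reports any serial shared by more than one disk. The function name find_shared_serials is hypothetical.

```shell
#!/bin/sh
# Read "NAME SERIAL" pairs on stdin (as printed by: lsblk -dn -o NAME,SERIAL)
# and print every serial that is reported by more than one device.
# Lines without a serial (for example loop devices) are skipped.
find_shared_serials() {
    awk '
        NF >= 2 { count[$2]++; names[$2] = names[$2] " " $1 }
        END {
            for (serial in count)
                if (count[serial] > 1)
                    print serial ":" names[serial]
        }
    '
}

# Usage inside the VM:
#   lsblk -dn -o NAME,SERIAL | find_shared_serials
```

Any line printed means two disks share a serial. No output means the serials are distinct and the cause lies elsewhere.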

pmarini@rcasys-laptop:~$ lxc image show 97121cfbcd9e9f2b0d0e15f049ad2b45186cecb922a1351368754db20215915a
auto_update: true
properties:
  architecture: amd64
  description: ubuntu 24.04 LTS amd64 (release) (20250403)
  label: release
  os: ubuntu
  release: noble
  serial: "20250403"
  type: disk1.img
  version: "24.04"
public: false
expires_at: 2029-05-31T00:00:00Z
profiles:
- default

Do you have any multipath configuration on the host side? Where does the disk backing the ZFS storage pool lxdpool01 come from?