Virtual machine won’t persist data on disk between reboots

Following the discussion here: Help launching aloha applicance with LXC

I am launching a virtual appliance with LXD. The image is based on Linux kernel version 5.10.0.

The image was successfully imported:

$ lxc image list
+-------------------+--------------+--------+---------------------------------------------+--------------+-----------------+-----------+-------------------------------+
|       ALIAS       | FINGERPRINT  | PUBLIC |                 DESCRIPTION                 | ARCHITECTURE |      TYPE       |   SIZE    |          UPLOAD DATE          |
+-------------------+--------------+--------+---------------------------------------------+--------------+-----------------+-----------+-------------------------------+
| aloha             | af90f8991940 | no     |                                             | x86_64       | VIRTUAL-MACHINE | 25.95MiB  | Apr 12, 2024 at 9:59am (UTC)  |
+-------------------+--------------+--------+---------------------------------------------+--------------+-----------------+-----------+-------------------------------+
| juju/bionic/amd64 | d7a7c8040154 | no     | ubuntu 18.04 LTS amd64 (release) (20230525) | x86_64       | CONTAINER       | 215.31MiB | May 29, 2023 at 1:31pm (UTC)  |
+-------------------+--------------+--------+---------------------------------------------+--------------+-----------------+-----------+-------------------------------+

Launching the VM/appliance also works:

lxc launch aloha aloha1 --console=vga \
    --config security.secureboot=false \
    --config security.csm=true \
    --debug

The instance starts and actually works as intended.

lxc list 
+--------------------+---------+----------------------+-----------------------------------------------+-----------------+-----------+
|        NAME        |  STATE  |         IPV4         |                     IPV6                      |      TYPE       | SNAPSHOTS |
+--------------------+---------+----------------------+-----------------------------------------------+-----------------+-----------+
| aloha1             | RUNNING | 10.23.167.143 (eth0) | fd42:30f6:f5e1:f9e7:216:3eff:fe8d:8341 (eth0) | VIRTUAL-MACHINE | 0         |
+--------------------+---------+----------------------+-----------------------------------------------+-----------------+-----------+

However, all changes to disk are reverted to the original state between reboots.

I tried creating a file in the /var filesystem of the instance; it is gone after a reboot.

My question: is this due to my LXD setup or to the image, and what should I look at to figure out how to persist changes in my running instances?
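
To make the symptom reproducible, a small marker-file check can be run inside the guest before and after a reboot. This is only a sketch: the marker path and both function names are made up for illustration, and it assumes the appliance offers a POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper functions, not part of LXD or the appliance.

# Write a UTC timestamp to the marker file and flush it to disk.
write_marker() {
    date -u +%Y-%m-%dT%H:%M:%SZ > "$1" && sync
}

# Report whether the marker file is still there.
check_marker() {
    if [ -f "$1" ]; then
        echo "present: $(cat "$1")"
    else
        echo "missing"
    fi
}

# Usage inside the guest (the path is an assumption):
#   write_marker /var/persist-test.marker   # before the reboot
#   check_marker /var/persist-test.marker   # after the reboot
```

If check_marker prints "missing" after the reboot, the write never reached persistent storage.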


Using Virtual Machine Manager (KVM), we can make it work by choosing “Generic or unknown OS” as the OS type. If we choose anything else, we get the same problem. So this might be a lead. Is there something equivalent in LXD? Can we specify the OS when launching the image?

As @joakimnyman said. This might also be related to what the image is allowed to do on the disk/device side of things; perhaps modifying permissions or settings would let the appliance work as expected.

More specifically, a qemu launch that works looks like this:

/usr/bin/qemu-system-x86_64 -name guest=aloha-albva-test -machine pc-i440fx-3.0,usb=off -enable-kvm -cpu qemu64 -m 512 -smp 2,sockets=2,cores=1,threads=1 -net user,hostfwd=tcp::$PORT-:22 -monitor none -kernel ${knl} -initrd ${ird} -drive file="${CI_TESTS_DIR}/flash.img,format=raw" -drive file=${CI_TESTS_DIR}/ssd.img,format=raw -append "root=/dev/ram0 ro auto quiet panic=1 img=active console=ttyS0" -nographic -serial stdio -net nic

This is in contrast to the command that LXD generates (which leads to an error when saving the config for us):

/snap/lxd/27049/bin/qemu-system-x86_64 -S -name aloha-dev-1 -uuid 686ecbb2-3397-49f3-ac24-5a7e3959fd37 -daemonize -cpu host -nographic -serial chardev:console -nodefaults -no-user-config -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=allow,resourcecontrol=deny -readconfig /var/snap/lxd/common/lxd/logs/aloha-dev-1/qemu.conf -spice unix=on,disable-ticketing=on,addr=/var/snap/lxd/common/lxd/logs/aloha-dev-1/qemu.spice -pidfile /var/snap/lxd/common/lxd/logs/aloha-dev-1/qemu.pid -D /var/snap/lxd/common/lxd/logs/aloha-dev-1/qemu.log -smbios type=2,manufacturer=Canonical Ltd.,product=LXD -runas lxd

I.e., we would like to figure out what makes the config save fail under the LXD launch but not under the QEMU launch above, and whether we can fix it without surgery.

What is the error you see?
Does the appliance support virtio scsi drivers?

@tomp we know that we got it to work with virt-install and that the command mentioned above works. It’s when we use Generic or unknown OS, as @joakimnyman mentions above, that it works. I.e., we would like to know how we can replicate that behavior with LXD.

Is the issue that LXD requires virtio scsi drivers for VMs?

I’m not really following how a system can boot and not persist its changes to disk while still being able to read from the disk.

Do you see anything in the syslog of the VM suggesting an issue writing?
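
One way to look for such messages from inside the guest, as a minimal sketch: it assumes the appliance gives shell access with grep and awk available, and the two function names and the list of error patterns are illustrative choices, not an exhaustive set.

```shell
#!/bin/sh
# Hypothetical diagnostic helpers, to be run inside the guest.

# Keep only kernel-log lines that commonly point at a storage problem.
# Reads log text on stdin, so it works with dmesg or a saved log file.
scan_storage_errors() {
    grep -iE 'i/o error|read-only|remount|no such device|mount.*fail|ext[234]-fs.*error' || true
}

# List the block devices the guest kernel actually detected.
# Defaults to /proc/partitions; skips its header line and the blank line after it.
list_block_devices() {
    awk 'NR > 2 { print $4 }' "${1:-/proc/partitions}"
}

# Usage inside the guest:
#   dmesg | scan_storage_errors
#   list_block_devices
```

If list_block_devices shows no disk at all, the guest kernel most likely has no driver for the bus the disk is attached to.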

@tomp it’s strange for us as well. We can’t understand what’s going on here.

We are investigating how we can adjust the qemu config for the instances, but we would really appreciate some assistance here, as it’s a major blocker for us: we need to get this working without having to run a full virt-manager to compensate…

Basically, the image below shows what works vs. what doesn’t work in terms of saving our config.

[Image: images-that-fails-and-works]

We figure that we might be able to get it to work with lxd if we can understand what to manipulate in LXD…

I don’t know, as I don’t know what that item means.
I would suggest focussing on what errors are occurring inside the appliance that are preventing writes.

This is what the error looks like and it seems related to “mounting” a flash drive as part of a “config save” action.

This is indeed our conclusion after some further testing. It seems we can get it to work if we set the bus to SATA or IDE, which is why it worked when we used generic or unknown os.

We don’t believe the appliance supports virtio-scsi… BUT it doesn’t make any sense to us, since the image seems to work “partially”. I mean, the VM shouldn’t be starting at all if this was completely unsupported…

Is there any workaround for this in LXD (while we try to convince the appliance maker, HAProxy, to support it)?

Try setting io.bus to nvme for the disk device.

https://documentation.ubuntu.com/lxd/en/latest/reference/devices_disk/#device-disk-device-conf:io.bus

Can you show the exact commands that are needed for this, as I’m not sure I’ve done this correctly?

What I did was to import the image first:

lxc image import ./aloha-albva.tgz --alias aloha

Then I tried

lxc init aloha aloha-nvme --vm --config security.secureboot=false --config security.csm=true --device root,io.bus=nvme
Creating aloha-nvme

Then started the aloha-nvme instance.

… But no luck. The same error occurs.

Perhaps I’m doing it wrong. This is the config after start.

lxc config show aloha-nvme 
architecture: x86_64
config:
  security.csm: "true"
  security.secureboot: "false"
  volatile.base_image: af90f89919403257a5586eb6d479c19111047c68fa809f96b1ed6565eaf5ec8b
  volatile.cloud-init.instance-id: 6e391323-f975-4544-bca8-dc928f2dbd60
  volatile.eth0.host_name: tapbfd82e3f
  volatile.eth0.hwaddr: 00:16:3e:4d:bd:ac
  volatile.last_state.power: RUNNING
  volatile.uuid: c9430711-630d-4feb-bec8-1a3e8e2f84bc
  volatile.uuid.generation: c9430711-630d-4feb-bec8-1a3e8e2f84bc
  volatile.vsock_id: "3118060784"
devices:
  root:
    io.bus: nvme
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""
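
For an instance that was launched without the --device flag (such as aloha1 above), the same io.bus setting can probably be applied afterwards by overriding the root disk inherited from the profile. A sketch, assuming the instance name aloha1 and a root device called root; the run wrapper and set_root_bus are illustrative helpers, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the instance, override the profile-inherited root disk with a
# local copy that carries the requested io.bus, then start it again.
set_root_bus() {
    instance="$1"
    bus="$2"
    run lxc stop "$instance" --force
    run lxc config device override "$instance" root "io.bus=$bus"
    run lxc start "$instance"
}

set_root_bus aloha1 nvme
```

On an instance where root is already a local device, as in the config shown above, lxc config device set would be the command to use instead of override.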

For reference, this is a script which shows how the LXD image was created and imported to LXD.

#!/bin/bash

# Define variables
name="aloha-albva"
description="A KVM image for HAProxy ALOHA."
disk_format="qcow2"
architecture="x86_64"
creation_date=$(date +%s)
source_type="qcow2"
source_url="./aloha-albva.img"

# Generate metadata.yaml
cat << EOF > metadata.yaml
name: $name
description: "$description"
disk_format: $disk_format
architecture: $architecture
creation_date: $creation_date
source:
  type: $source_type
  url: $source_url
EOF

echo "metadata.yaml generated successfully!"

# Create the image the way lxd wants it to be packed with the metadata.yaml
tar -czvf aloha-albva.tgz aloha-albva.img metadata.yaml

# Import the image to lxd
lxc image import ./aloha-albva.tgz --alias aloha

The image could then be used to launch new instances of aloha (once the image supports virtio-scsi or nvme drivers).

# Launch a VM using the newly imported image.
lxc launch aloha aloha1 --console=vga \
    --config security.secureboot=false \
    --config security.csm=true \
    --debug

That’s correct.

Looks like it doesn’t support nvme either.

We intend to cherry-pick virtio-blk support from Incus too.

But in general LXD doesn’t support legacy devices and only supports modern VMs.

This is from HAProxy:

The ALOHA image comes only with a reduced set of Linux drivers. Please use SATA or IDE disk to let Linux attach the proper driver during the PCI bus probing. Same for Ethernet Driver

Once the image has been started, it runs in RAM but saves its state to a /flash directory, which is mounted as part of that “config save” command within the image.

The Ethernet driver mentioned in the comment seems to work.

SATA devices are legacy?

What we mean by that is that they emulate hardware, whereas virtio drivers are lightweight pass-through devices.

You may be able to attach a custom drive and specify its driver type using this approach:

https://documentation.ubuntu.com/lxd/en/latest/reference/instance_options/#instance-options-qemu

  • To specify custom virtual devices when VirtIO is not supported by the guest OS.
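
As a rough sketch of that approach, an extra disk could be attached through QEMU’s emulated IDE/SATA path with raw.qemu. Everything specific here is an assumption: the image path /srv/aloha/flash.img is hypothetical, the helper names are made up, it has not been tested against the ALOHA appliance, and LXD’s AppArmor confinement may need adjusting before QEMU can open the file. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build the extra QEMU arguments: a raw image as a backend drive (if=none)
# plus an emulated IDE hard disk (ide-hd) that uses it.
build_raw_qemu() {
    printf '%s\n' "-drive file=$1,format=raw,if=none,id=flash0 -device ide-hd,drive=flash0"
}

# Store those arguments in the instance's raw.qemu option.
attach_ide_disk() {
    instance="$1"
    image="$2"
    run lxc config set "$instance" "raw.qemu=$(build_raw_qemu "$image")"
}

attach_ide_disk aloha1 /srv/aloha/flash.img   # hypothetical image path
```

The setting is read when QEMU is launched, so the instance would need a restart before the guest can see the disk.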

So basically, we had to back away from this, since the appliance itself doesn’t yet support modern device drivers (virtio-scsi / nvme) and there seems to be no way to tell LXD to use SATA/IDE.

We are stuck in a bad place here.

So our only way forward is to run qemu directly on the host. Super awkward, but I guess there is nothing we can really do about it.

We will have a discussion with haproxy about the situation, as their software is really what we like to be able to run. But that has little to do with the situation.

I would, however, be curious how future LXD support will take into account that SATA, at least, is not yet phased out or legacy.

This is how we start the image now.

sudo qemu-system-x86_64 -accel kvm -cpu host -smp 2 -m 1G -drive file=aloha1.img,format=qcow2,if=ide -net nic,model=virtio-net-pci -net bridge,br=lxdbr0 -vga std -no-user-config -nodefaults

Thanks for assisting us all this way, though.