Rename LXD QEMU VM ring buffer to com.canonical.lxd

Project LXD
Status Implemented
Author(s) @dinmusic
Approver(s) @tomp
Release 5.20

Abstract

Rename the virtual machine ring buffer from org.linuxcontainers.lxd to com.canonical.lxd.

Rationale

As part of the LXD project migration from linuxcontainers.org to canonical.com, we also need to update the ring buffer name in virtual machines.

Specification

Design

Background

Each virtual machine (VM) hosts an LXD agent, a lightweight binary that allows the LXD server to monitor and modify the VM. When the LXD agent starts, it writes the VM status (Running/Stopped) to the ring buffer.

A ring buffer is a data structure used for storing temporary data. In this context, it’s linked to a virtual serial device in the VM. When the LXD agent starts or stops, it records its status to the ring buffer through this virtual serial device. The current name for this device is org.linuxcontainers.lxd. The ring buffer collects the data from this device and holds it until the LXD server reads from it.
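As a rough illustration of the ring buffer behaviour described above, here is a minimal Python sketch. It is not QEMU's or LXD's implementation: the RingBuffer class, its capacity, and the status strings are assumptions made only for this example. It shows that once the buffer is full, the oldest bytes are discarded, and that the reader drains whatever is currently held.

```python
from collections import deque


class RingBuffer:
    """Hypothetical fixed-capacity byte buffer that drops the oldest bytes when full."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) discards items from the left once capacity is reached.
        self._buf = deque(maxlen=capacity)

    def write(self, data: bytes) -> None:
        # Writer side: stands in for the agent writing to the serial device.
        self._buf.extend(data)

    def read(self) -> bytes:
        # Reader side: stands in for the server draining the buffered data.
        data = bytes(self._buf)
        self._buf.clear()
        return data


# Illustrative status strings only; the real wire format is not specified here.
rb = RingBuffer(capacity=16)
rb.write(b"Running\n")
rb.write(b"Stopped\n")
rb.write(b"Running\n")
print(rb.read())  # only the most recent 16 bytes remain
```

The fixed capacity is why a reader only ever sees the most recent status lines, not the full history.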

Compatibility

A new virtual serial device named com.canonical.lxd is being introduced, and the ring buffer will connect to it instead of the old device (org.linuxcontainers.lxd). As a result, existing VMs that are still running an older version of the LXD agent will continue to send status updates to the old device, while the LXD server looks for these updates in the ring buffer connected to the new device, as depicted in the following figure:

What Happens with Existing Virtual Machines?

Existing VMs will remain completely functional. After LXD is upgraded, it can still use the old serial device, which remains connected to the ring buffer. Once the VM is restarted, QEMU is reconfigured with the new serial device, and a new lxd-agent process will detect and use it.

Limitations

Stateful Snapshots

Renaming the ring buffer introduces a breaking change for stateful VM snapshots. VMs can still be restored from those snapshots as long as their state is not restored (stateless restoration). Restoring the VM’s state will not be possible, because the LXD server attaches a new serial device to the VM that is not recorded in the previous VM state.

$ lxc launch ubuntu:22.04 v1 --vm
$ lxc snapshot v1 mysnap --stateful

# >> Upgrade LXD

# This will work because the VM state is not restored.
$ lxc restore v1 mysnap
$ lxc start v1

# This will NOT work because a new serial device is attached to the VM
# (it is not included in the previous VM state).
$ lxc restore v1 mysnap --stateful
Error: Failed restoring state from "/var/snap/lxd/common/lxd/virtual-machines/v1/state": Monitor is disconnected

$ lxc info v1 --show-log
...
qemu-system-x86_64: Failed to load virtio-console:virtio
qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:01.0:00.5/virtio-console'
qemu-system-x86_64: load of migration failed: Invalid argument

Daemon Changes

LXD Server

A new virtual serial device named com.canonical.lxd is introduced and connected to the ring buffer that, prior to this change, was linked to the serial device named org.linuxcontainers.lxd.

LXD Agent

The agent is updated to connect to the serial device named com.canonical.lxd instead of the serial device named org.linuxcontainers.lxd.
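To make the device lookup concrete, here is a hedged Python sketch of how a guest-side process could locate the status port. It assumes a Linux guest, where named virtio-serial ports appear under /dev/virtio-ports. The function name find_status_port and the fallback to the old name are illustrative assumptions, not the actual lxd-agent logic (which is written in Go and may behave differently).

```python
from pathlib import Path
from typing import Optional

NEW_NAME = "com.canonical.lxd"
OLD_NAME = "org.linuxcontainers.lxd"


def find_status_port(ports_dir: str = "/dev/virtio-ports") -> Optional[Path]:
    """Return the first status port found, preferring the new name.

    Hypothetical helper: falling back to the old name is an assumption
    made for this example, not documented agent behaviour.
    """
    for name in (NEW_NAME, OLD_NAME):
        candidate = Path(ports_dir) / name
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    port = find_status_port()
    print(port if port is not None else "no LXD status port found")
```

Preferring the new name and only then trying the old one would let a single agent build work both before and after the VM restart that swaps the device.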

API changes

No API changes expected.

CLI changes

No CLI changes expected.

@dinmusic could something be done to remove this requirement?
If the VM hasn’t been restarted, then the QEMU process is still configured to use the old device name and associated ring buffer, even if LXD itself has been upgraded to the newer version.

Could we get LXD to decide which ring buffer to read from and write to, or even better, keep using the original one for both new and old VMs, so that upgrading LXD doesn’t cause running VMs to break?

Associated PR:

https://github.com/canonical/lxd/pull/12548

I’ve confirmed that, because the ring buffer name on the QEMU internal side hasn’t changed, if you upgrade LXD while a guest VM is still running, LXD will still be able to use the old serial device, as it is still connected to the ring buffer. When that VM is restarted, QEMU will be reconfigured to have the new serial device, and a new lxd-agent process will detect and use it.

Thanks @tomp. I’ve updated the spec to reflect your confirmation.
