Unable to delete VMs after crash

I have a two-server LXD cluster running 5.21 LTS, named hydra1 and hydra2. The hydra2 server had -some- event and crashed while I was running some automated deployments. After rebooting the server (I don't recall whether I was able to shut down cleanly or not), I have two VMs that are stopped and cannot be deleted. Uncertain if there is a magic incantation for this?

Note that the two stopped VMs are on hydra2:

$ lxc list --project hydradev
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
|                  NAME                   |  STATE  |       IPV4        | IPV6 |      TYPE       | SNAPSHOTS | LOCATION |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-177ca778-2438-4bdf-6c42-f6c809d0614b | RUNNING | 10.0.4.104 (eth0) |      | VIRTUAL-MACHINE | 0         | hydra1   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-499dd539-ac4d-4f98-5051-1aaf1f855e16 | RUNNING | 10.0.5.4 (eth0)   |      | VIRTUAL-MACHINE | 0         | hydra2   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-dfaca061-1f86-4dea-48bb-83c750181681 | STOPPED |                   |      | VIRTUAL-MACHINE | 0         | hydra2   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-e1ab555b-0aec-4167-4f48-a4d425014f72 | STOPPED |                   |      | VIRTUAL-MACHINE | 0         | hydra2   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+

The VMs cannot be deleted (the -f is for good measure :wink: ):

$ lxc rm --project hydradev -f vm-dfaca061-1f86-4dea-48bb-83c750181681
Error: Failed deleting instance "vm-dfaca061-1f86-4dea-48bb-83c750181681" in project "hydradev": Failed getting instance pool: Instance storage pool not found
$ lxc rm --project hydradev -f vm-e1ab555b-0aec-4167-4f48-a4d425014f72
Error: Failed deleting instance "vm-e1ab555b-0aec-4167-4f48-a4d425014f72" in project "hydradev": Failed getting instance pool: Instance storage pool not found

Based on the "instance pool" message, I do note that neither of these VMs has a root ("/") volume!

$ lxc storage volume list --project hydradev | grep "vm-"
| local | virtual-machine | vm-177ca778-2438-4bdf-6c42-f6c809d0614b                          |                 | block        | 1       | hydra1   |
| local | virtual-machine | vm-499dd539-ac4d-4f98-5051-1aaf1f855e16                          |                 | block        | 1       | hydra2   |

Note that the only VMs in the storage pool are the two running ones.

Any pointers would be great!

Thanks!

Hello, could you please show the output of
lxc config show vm-dfaca061-1f86-4dea-48bb-83c750181681 --expanded ?

Also, what type of storage are you using (zfs/dir)?

Sure! –

$ lxc config show vm-dfaca061-1f86-4dea-48bb-83c750181681 --expanded --project hydradev
architecture: x86_64
config:
  image.architecture: x86_64
  image.description: bosh-openstack-kvm-ubuntu-jammy-go_agent-1.1016
  image.os: Ubuntu
  image.root_device_name: /
  image.root_disk_size: 5120MiB
  limits.cpu: "2"
  limits.memory: 2048MiB
  raw.qemu: -bios bios-256k.bin
  volatile.base_image: afca4a3f1fa2b7374865195a7847debcc5bf715b79d76e05cea6d68c0de74787
  volatile.cloud-init.instance-id: 57b4a2c9-a21e-4e98-a3e4-631b9667fc55
  volatile.eth0.hwaddr: 00:16:3e:76:2e:85
  volatile.uuid: ce43bb12-f0f7-4e95-94c0-476f07ac8f3f
  volatile.uuid.generation: ce43bb12-f0f7-4e95-94c0-476f07ac8f3f
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: fanbr1
    type: nic
  root:
    path: /
    pool: local
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

Storage is btrfs (I used ZFS for a while, but it caused odd issues with … I forget, locking or something):

$ lxc storage list
+-------+--------+-------------+---------+---------+
| NAME  | DRIVER | DESCRIPTION | USED BY |  STATE  |
+-------+--------+-------------+---------+---------+
| local | btrfs  |             | 113     | CREATED |
+-------+--------+-------------+---------+---------+

And, I don't think this matters, but networking is the Ubuntu Fan:

$ lxc network show fanbr1
name: fanbr1
description: ""
type: bridge
managed: true
status: Created
config:
  bridge.mode: fan
  fan.overlay_subnet: 10.0.0.0/16
  fan.underlay_subnet: 192.168.1.0/24
  ipv4.nat: "true"
used_by:
- /1.0/instances/vm-dfaca061-1f86-4dea-48bb-83c750181681?project=hydradev
<snip>
locations:
- hydra1
- hydra2
project: default

It looks like you have a missing storage volume record for the instance; that error comes from the function that looks up the instance's storage pool.

I suggest re-creating the missing storage volume row using the lxd sql global command.

E.g.

lxd sql global 'select * from storage_volumes'
+-----+------------------------------------------------------------------+-----------------+---------+------+-------------+------------+--------------+--------------------------------+
| id  |                               name                               | storage_pool_id | node_id | type | description | project_id | content_type |         creation_date          |
+-----+------------------------------------------------------------------+-----------------+---------+------+-------------+------------+--------------+--------------------------------+
| 608 | c1                                                               | 1               | 1       | 0    |             | 2          | 0            | 2026-01-26T10:43:47.971162478Z |
+-----+------------------------------------------------------------------+-----------------+---------+------+-------------+------------+--------------+--------------------------------+

Then use lxd sql global 'insert into …' to insert a row whose volume name matches the instance name.
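As a sketch of what that repair amounts to, here is the same insert run against a simplified SQLite mock of the storage_volumes table (column names taken from the query output above; the real LXD schema has more constraints, so this is an illustration, not the actual database):

```python
import sqlite3

# Simplified stand-in for LXD's storage_volumes table; NOT the real schema.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE storage_volumes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        storage_pool_id INTEGER NOT NULL,
        node_id INTEGER NOT NULL,
        type INTEGER NOT NULL,
        description TEXT NOT NULL DEFAULT '',
        project_id INTEGER NOT NULL,
        content_type INTEGER NOT NULL
    )
""")

# Re-create the missing record: the name must equal the instance name, and
# node_id must point at the cluster member that hosts the instance (here 2).
db.execute(
    "INSERT INTO storage_volumes "
    "(name, storage_pool_id, node_id, type, description, project_id, content_type) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("vm-dfaca061-1f86-4dea-48bb-83c750181681", 1, 2, 3, "delete me", 6, 1),
)
db.commit()

row = db.execute(
    "SELECT name, node_id FROM storage_volumes WHERE name LIKE 'vm-dfaca%'"
).fetchone()
print(row)
```

The point is only that the row's name column is what ties the volume back to the instance; everything else (pool, node, project, type) is copied from a known-good sibling row.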

Ok, tried that and got a new error. TL;DR: I wonder if I'm going to have to do surgery and delete the records instead?

What I did…

  1. Was stumped until I realized I needed to be on the actual server (I missed that it was the 'lxd' command, not 'lxc'). So I ssh'd to hydra1 (habit; I assume the database is shared across the cluster, so I think it's ok).

  2. The output of $ lxd sql global 'select * from storage_volumes' was really big, so…

$ lxc list --project hydradev
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
|                  NAME                   |  STATE  |       IPV4        | IPV6 |      TYPE       | SNAPSHOTS | LOCATION |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-177ca778-2438-4bdf-6c42-f6c809d0614b | RUNNING | 10.0.4.104 (eth0) |      | VIRTUAL-MACHINE | 0         | hydra1   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-499dd539-ac4d-4f98-5051-1aaf1f855e16 | RUNNING | 10.0.5.4 (eth0)   |      | VIRTUAL-MACHINE | 0         | hydra2   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-dfaca061-1f86-4dea-48bb-83c750181681 | STOPPED |                   |      | VIRTUAL-MACHINE | 0         | hydra2   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-e1ab555b-0aec-4167-4f48-a4d425014f72 | STOPPED |                   |      | VIRTUAL-MACHINE | 0         | hydra2   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+

… and …

$ lxd sql global 'select * from storage_volumes where name in ("vm-177ca778-2438-4bdf-6c42-f6c809d0614b", "vm-499dd539-ac4d-4f98-5051-1aaf1f855e16", "vm-dfaca061-1f86-4dea-48bb-83c750181681", "vm-e1ab555b-0aec-4167-4f48-a4d425014f72")'
+------+-----------------------------------------+-----------------+---------+------+-------------+------------+--------------+--------------------------------+
|  id  |                  name                   | storage_pool_id | node_id | type | description | project_id | content_type |         creation_date          |
+------+-----------------------------------------+-----------------+---------+------+-------------+------------+--------------+--------------------------------+
| 2116 | vm-177ca778-2438-4bdf-6c42-f6c809d0614b | 1               | 1       | 3    |             | 6          | 1            | 2025-05-08T21:55:10.063126096Z |
| 2596 | vm-499dd539-ac4d-4f98-5051-1aaf1f855e16 | 1               | 2       | 3    |             | 6          | 1            | 2026-01-17T21:08:17.484696575Z |
+------+-----------------------------------------+-----------------+---------+------+-------------+------------+--------------+--------------------------------+

which narrowed the list down to just what I expected from your comments.

  3. Had to figure out the meaning of node_id (etc). Found it refers to the cluster nodes: 1 = hydra1, 2 = hydra2 (very convenient).

  4. Inserted rows…

$ lxd sql global 'insert into storage_volumes (name, storage_pool_id, node_id, type, description, project_id, content_type) values ("vm-dfaca061-1f86-4dea-48bb-83c750181681", 1, 2, 3, "delete me", 6, 1)'
Rows affected: 1
$ lxd sql global 'insert into storage_volumes (name, storage_pool_id, node_id, type, description, project_id, content_type) values ("vm-e1ab555b-0aec-4167-4f48-a4d425014f72", 1, 2, 3, "delete me", 6, 1)'
Rows affected: 1
$ lxd sql global 'select * from storage_volumes where name in ("vm-177ca778-2438-4bdf-6c42-f6c809d0614b", "vm-499dd539-ac4d-4f98-5051-1aaf1f855e16", "vm-dfaca061-1f86-4dea-48bb-83c750181681", "vm-e1ab555b-0aec-4167-4f48-a4d425014f72")'
+------+-----------------------------------------+-----------------+---------+------+-------------+------------+--------------+--------------------------------+
|  id  |                  name                   | storage_pool_id | node_id | type | description | project_id | content_type |         creation_date          |
+------+-----------------------------------------+-----------------+---------+------+-------------+------------+--------------+--------------------------------+
| 2116 | vm-177ca778-2438-4bdf-6c42-f6c809d0614b | 1               | 1       | 3    |             | 6          | 1            | 2025-05-08T21:55:10.063126096Z |
| 2596 | vm-499dd539-ac4d-4f98-5051-1aaf1f855e16 | 1               | 2       | 3    |             | 6          | 1            | 2026-01-17T21:08:17.484696575Z |
| 2902 | vm-dfaca061-1f86-4dea-48bb-83c750181681 | 1               | 2       | 3    | delete me   | 6          | 1            | 0001-01-01T00:00:00Z           |
| 2903 | vm-e1ab555b-0aec-4167-4f48-a4d425014f72 | 1               | 2       | 3    | delete me   | 6          | 1            | 0001-01-01T00:00:00Z           |
+------+-----------------------------------------+-----------------+---------+------+-------------+------------+--------------+--------------------------------+
  5. … and yuck!
$ lxc list --project hydradev
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
|                  NAME                   |  STATE  |       IPV4        | IPV6 |      TYPE       | SNAPSHOTS | LOCATION |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-177ca778-2438-4bdf-6c42-f6c809d0614b | RUNNING | 10.0.4.104 (eth0) |      | VIRTUAL-MACHINE | 0         | hydra1   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-499dd539-ac4d-4f98-5051-1aaf1f855e16 | RUNNING | 10.0.5.4 (eth0)   |      | VIRTUAL-MACHINE | 0         | hydra2   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-dfaca061-1f86-4dea-48bb-83c750181681 | STOPPED |                   |      | VIRTUAL-MACHINE | 0         | hydra2   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
| vm-e1ab555b-0aec-4167-4f48-a4d425014f72 | STOPPED |                   |      | VIRTUAL-MACHINE | 0         | hydra2   |
+-----------------------------------------+---------+-------------------+------+-----------------+-----------+----------+
$ lxc rm -f vm-dfaca061-1f86-4dea-48bb-83c750181681
Error: Failed checking instance exists "local:vm-dfaca061-1f86-4dea-48bb-83c750181681": Instance not found
$ lxc rm -f vm-e1ab555b-0aec-4167-4f48-a4d425014f72
Error: Failed checking instance exists "local:vm-e1ab555b-0aec-4167-4f48-a4d425014f72": Instance not found

I assume "instance not found" == the virtual machine isn't running. Which we knew! So that's where I'm starting to wonder if I need to delete the records instead. With any luck, the relationships cascade. If not, I'll likely need a little bit of guidance as to what to clean up.
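Whether deletes actually cascade depends on how the foreign keys are declared in LXD's schema, which I haven't verified. As a generic illustration of the behavior being hoped for here, a child row declared with ON DELETE CASCADE in SQLite disappears along with its parent (the table names below are hypothetical, not LXD's real ones):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical parent/child tables standing in for an instance record and
# one of its dependents; NOT the real LXD schema.
db.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("""
    CREATE TABLE instance_config (
        id INTEGER PRIMARY KEY,
        instance_id INTEGER REFERENCES instances(id) ON DELETE CASCADE,
        key TEXT,
        value TEXT
    )
""")

db.execute("INSERT INTO instances VALUES (1, 'vm-stuck')")
db.execute("INSERT INTO instance_config VALUES (1, 1, 'limits.cpu', '2')")

# Deleting the parent removes the dependent config row as well.
db.execute("DELETE FROM instances WHERE name = 'vm-stuck'")
leftover = db.execute("SELECT COUNT(*) FROM instance_config").fetchone()[0]
print(leftover)
```

If LXD's tables are declared this way, deleting the orphaned instance rows would clean up their dependents; if not, each referencing table would need its own delete.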

One thing I noticed is that your lxc rm -f commands are missing the --project hydradev flag. Does the error show up even when you run these commands with the flag?

I had set boshdev as the default project (which isn’t obvious).

So…

$ lxc rm -f vm-dfaca061-1f86-4dea-48bb-83c750181681
Error: Failed checking instance exists "hydra1:vm-dfaca061-1f86-4dea-48bb-83c750181681": Instance not found
$ lxc rm -f vm-dfaca061-1f86-4dea-48bb-83c750181681 --project boshdev
Error: Failed checking instance exists "hydra1:vm-dfaca061-1f86-4dea-48bb-83c750181681": Instance not found

Just for confirmation.