LXD Containers Not Removed in CI, Even with Force Flag

Hey folks,

We are encountering intermittent issues in our CI pipeline where LXD containers are not being removed, even when the --force flag is used. We are currently using lxd 5.21/stable.

Environment:

  • GitHub self-hosted runners
  • Ubuntu 20.04 and 22.04 (may also occur on 24.04)
  • Each test in our CI spins up 1-3 Kubernetes nodes in separate LXD containers. After testing, the containers are removed using:
    lxc rm <container-name> --force

Issue:
Most of the time, this process works as expected. However, in rare cases, the lxc rm --force command hangs indefinitely until the GitHub runner hits its timeout limit (6 hours). Here are some examples of the issue:

Additional Information:

Expected Behavior:
LXD containers should be reliably removed using lxc rm <container-name> --force without hanging.
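
Until the root cause is known, one possible stopgap is to put a deadline on the removal so that a hang fails the job quickly instead of holding the runner for 6 hours. The sketch below is only an illustration: remove_with_deadline, the LXC_BIN override, and the 120-second default are hypothetical and not part of LXD. It relies on timeout from GNU coreutils. Stopping the client does not necessarily cancel the operation inside the LXD daemon, so the container may still exist afterwards.

```shell
#!/usr/bin/env bash
# Sketch only: bound container removal with a deadline so a hung
# "lxc rm --force" fails the CI step instead of blocking the runner.
# remove_with_deadline and LXC_BIN are hypothetical names, not part of LXD.

remove_with_deadline() {
  local name="$1"
  local deadline="${2:-120}"   # seconds before SIGTERM is sent (assumed value)
  local rc=0

  # timeout (GNU coreutils) sends SIGTERM at the deadline and SIGKILL
  # 10 seconds later if the client is still running.
  timeout --kill-after=10 "$deadline" "${LXC_BIN:-lxc}" rm "$name" --force || rc=$?

  # 124: stopped by SIGTERM at the deadline; 137: had to be SIGKILLed.
  if [ "$rc" -eq 124 ] || [ "$rc" -eq 137 ]; then
    echo "removal of $name did not finish within ${deadline}s" >&2
  fi
  return "$rc"
}
```

In a CI step this could be called as remove_with_deadline <container-name> 300, with a non-zero exit status marking the step as failed so the hang is visible in the job log.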

Let me know if you need further information.

What is being run inside the container, and what is its configuration (the output of lxc config show <instance> --expanded)?

Could you also capture the output of journalctl and dmesg when it happens?
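
If it helps, the requested output could be gathered in one go from the runner while the removal is stuck. This is a rough sketch under a few assumptions: collect_lxd_diagnostics and the file names are hypothetical, the journalctl unit snap.lxd.daemon assumes LXD is installed from the snap, and the 30-minute window is arbitrary. Each command is wrapped in timeout because the daemon itself may not be responding.

```shell
#!/usr/bin/env bash
# Sketch only: collect the requested diagnostics into one directory.
# collect_lxd_diagnostics and the output file names are hypothetical.

collect_lxd_diagnostics() {
  local instance="$1"
  local outdir="${2:-lxd-diagnostics}"
  mkdir -p "$outdir"

  # Every command is bounded by timeout in case the LXD daemon is
  # unresponsive; failures are tolerated so later files still get written.
  timeout 30 lxc config show "$instance" --expanded \
    > "$outdir/config.txt" 2>&1 || true

  # Assumption: LXD comes from the snap, so the daemon logs under
  # the snap.lxd.daemon unit. Adjust the unit for other install methods.
  timeout 30 journalctl --no-pager -u snap.lxd.daemon --since "30 minutes ago" \
    > "$outdir/journal.txt" 2>&1 || true

  # Kernel log with human-readable timestamps; may need root on some hosts.
  timeout 30 dmesg -T > "$outdir/dmesg.txt" 2>&1 || true
}
```

Running collect_lxd_diagnostics <instance> ./artifacts from a second shell on the runner, or from a cleanup step, would leave three text files that can be attached to the thread.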