Error: Failed to retrieve PID of executing child process [new bug]

Hi,
Today on both of my Debian 12 host servers I get:
Error: Failed to retrieve PID of executing child process
when I try to enter any container.

LXD is in version:
lxd 5.21.1-2d13beb 28463 latest/stable canonical✓ -

What happened to break it?

Looks like it's a new bug :frowning:

I say that because I have a third server on Debian 12 where it is working, but its LXD version is slightly older:
lxd 5.21.1-d46c406 28460 5.21/stable canonical✓ -

Most likely some scheduled process on your system cleaned the LXD logs directory containing the lxc.conf file for those containers.

See Error: Failed to retrieve PID of executing child process · Issue #12084 · canonical/lxd · GitHub

No. Every container's log directory has an lxc.conf:

root@f6~# ls -l /var/snap/lxd/common/lxd/logs/martel/
total 8
-rw-r--r-- 1 root root 0 Jul 2 11:53 forkexec.log
-rw-r----- 1 root root 2447 Jun 27 22:51 lxc.conf
-rw-r----- 1 root root 437 Jul 2 11:53 lxc.log

root@f6~# ls -l /var/snap/lxd/common/lxd/logs/ftp-znak/
total 16
-rw------- 1 root root 5909 Jun 30 10:06 console.log
-rw-r--r-- 1 root root 0 Jun 30 10:08 forkexec.log
-rw-r--r-- 1 root root 0 Jun 30 10:06 forkstart.log
-rw-r----- 1 root root 2339 Jun 30 10:06 lxc.conf
-rw-r----- 1 root root 3678 Jul 2 09:57 lxc.log
-rw-r----- 1 root root 0 Jun 30 10:06 lxc.log.old
root@f6~#
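
To run the same check across every instance at once rather than one directory at a time, something like the sketch below could be used. The function name find_missing_lxc_conf is made up for illustration, and the default path is simply the snap logs directory shown in the listings above; adjust it if your install differs.

```shell
# List instance log directories that do not contain an lxc.conf file.
# $1: logs directory (assumed default: the snap path used in this thread).
find_missing_lxc_conf() {
    logs_dir="${1:-/var/snap/lxd/common/lxd/logs}"
    for dir in "$logs_dir"/*/; do
        # Skip the unexpanded glob when there are no subdirectories.
        [ -d "$dir" ] || continue
        if [ ! -f "${dir}lxc.conf" ]; then
            # Print only the instance name, not the full path.
            basename "$dir"
        fi
    done
}
```

Calling find_missing_lxc_conf with no argument and getting no output would mean every instance directory still has its lxc.conf, which matches what the two listings show.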

Does restarting one of the containers fix it?

Also it could be tmp files being cleaned, see

Yes,
lxc restart container_name
fixes the issue.

So now I have to plan a restart of all containers, or of the whole host server, in a maintenance window. LXD shouldn’t work like that. I haven’t made any upgrades on the host server, and I have no idea which process deleted the tmp files that LXD depends on. That shouldn’t have happened.

LXD should be resilient to this.

And by the way: how is it possible that this happened only on the hosts running the newest LXD version?

Did you see my last post? It could also be /tmp/snap-private-tmp getting cleaned up.

I agree, but as it's something external to LXD doing this, the way forward is not entirely clear at this point (hence the earlier post on the snapcraft forum).

You can try the workaround mentioned in the snap post:

Workaround: At the top of /usr/lib/tmpfiles.d/snapd.conf I added: x /tmp/snap-private-tmp/snap.lxd

Hopefully that will stop it happening again.
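
If you want to confirm the exclusion line is still in place later on (files under /usr/lib belong to the package and may be replaced on an update), a small check along these lines might help. This is only a sketch: the function name has_lxd_tmp_exclusion is made up for illustration, while the file path and the `x` line are the ones from the workaround above.

```shell
# Return success if the given tmpfiles.d file contains an "x" line
# (ignore this path during cleaning) for LXD's private tmp directory.
# $1: tmpfiles.d config file to inspect.
has_lxd_tmp_exclusion() {
    conf_file="$1"
    grep -Eq '^x[[:space:]]+/tmp/snap-private-tmp/snap\.lxd([[:space:]]|$)' "$conf_file"
}
```

For example: has_lxd_tmp_exclusion /usr/lib/tmpfiles.d/snapd.conf && echo "exclusion present"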

We will also get Instance: Re-generate missing container lxc.conf during Exec (from Incus) by tomponline · Pull Request #13697 · canonical/lxd · GitHub into LXD 6.1 (although we’ve ascertained that a missing lxc.conf isn’t the issue in your case).