Instance creation fails due to fingerprint issue (LXD 5.0.3)

I was trying to create a container instance today using images:alpine/3.19 and I got this error:

Error: Failed instance creation: Failed getting remote image info: Failed getting image: More than one match for the provided partial fingerprint "5112eb1849507214eb0f00052b891bb7e3c3722549c019870e2bfe7a1313fa0f"

I’m not sure what’s happening here. I tried creating a new cluster and was able to start the instance.

This is happening on my existing cluster. If anyone has any clue how to resolve this please let me know.

There is already a fix rolling out for this in the latest/stable and 5.0/stable snap channels:

https://github.com/canonical/lxd/pull/12829
https://github.com/canonical/lxd/pull/12834

Ah, although you’re using the Linux Containers images remote, which is in the process of phasing out support for LXD users.

Perhaps the fix above is once again causing problems with that image server due to support for multiple incus and lxd fields.

What does snap list show?

On the cluster where I cannot launch a container, this is what snap list shows:

lxd 5.0.3-9b310f5 26975 5.0/stable/… canonical✓ -

On one of my other clusters it shows the following. On that one, launching containers works fine, even from the images: remote:

lxd 5.0.3-51452c3 26881 5.0/stable/… canonical✓ -

I’m guessing all I have to do is wait for the fix to roll out?

@tomp I understand it’s being phased out, but things were working before.

Is there anything I need to do to fix this?

Well yes and no.

The fix we landed to support the new Linux Containers metadata for Incus was causing problems for multi-format Ubuntu remotes.

This is why I reverted it yesterday.

But now this seems to be causing problems with the Linux Containers remote.

I’ll see if that remote is still using duplicate metadata keys (for LXD and Incus), and if so I’ll drop Incus support from the 5.0 LTS series so the duplicate metadata entries don’t cause this issue.

I think as it stands the fix rolling out is causing this issue, but fixing another one.

@tomp :pray: Thank you so much. We have customers running production apps on LXD and we don’t want things to break. Your work is much appreciated.

So I ran snap refresh on my test clusters and it’s basically broken now with the new revision.

I can’t even boot an instance from the API. I hope this can be fixed soon, because it would cause us serious problems: our business relies on booting up containers.

This would be catastrophic for us.

@tomp we’re also getting our own image server up and running, so problems like this can be solved from our side in the future.

However, that will take time, so we would greatly appreciate it if this problem could be fixed in the meantime.

Can you confirm it’s just the images: remote that is not working right?

Is ubuntu: working OK?

Have you tried running snap list lxd --all and then sudo snap revert lxd --revision {revision}?

If you switch back to the previous LXD 5.0.3 revision you should be able to launch from images: again until we figure this one out.
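
The revert suggested above can be scripted. Below is a minimal sketch, assuming `snap list lxd --all` prints its usual columns (Name, Version, Rev, Tracking, Publisher, Notes) and marks installed but inactive revisions with `disabled` in the Notes column. `previous_revision` is a hypothetical helper name, not part of snap or LXD.

```shell
# previous_revision: hypothetical helper, not a snap or LXD command.
# Reads `snap list lxd --all` output on stdin and prints the revision number
# of the last row whose Notes column contains "disabled" (an installed but
# inactive revision). Prints nothing when there is no such row.
previous_revision() {
    awk 'NR > 1 && $1 == "lxd" && $NF ~ /disabled/ { rev = $3 }
         END { if (rev != "") print rev }'
}

# Usage on a cluster member (left commented out because it changes the system):
#   rev=$(snap list lxd --all | previous_revision)
#   [ -n "$rev" ] && sudo snap revert lxd --revision "$rev"
```

Check the chosen revision against the `snap list lxd --all` output before reverting. On a system with several inactive revisions, the last one listed may not be the one you want.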

Yes

images: is broken
ubuntu: is working

I’ll try and revert the revision

This should fix it

https://github.com/canonical/lxd/pull/12847

Will roll it out ASAP.

We use an automated clustering setup, so while a manual revert can work on our own setup, it won’t work for customers whose setups rely on automation.

Thank you so much! Really appreciate the prompt response to fixing this issue.

Yes I understand, sorry about this.

So this saga started about 5 months ago when we were informed that the images.linuxcontainers.org image server was going to drop support for lxd.tar.gz metadata file and instead start serving up an incus.tar.gz file in a very similar format that LXD could still consume. We were told it wasn’t going to happen immediately so it gave us time to land support for this in https://github.com/canonical/lxd/pull/12260 before it was removed.

The fix worked well for LXD consuming from images.linuxcontainers.org but started to cause problems with LXD consuming from other Canonical remotes that offered up both combined and uncombined image formats. This wasn’t observed on the ubuntu: remote but on an internal remote used for snap core image builds.

So with the release of LXD 5.0.3 people started noticing this problem and it was fixed in https://github.com/canonical/lxd/pull/12834 (the problem had been present from LXD 5.18 onwards but wasn’t noticed).

However, as we have now seen, this has reintroduced the issue with consuming the images: remote because it’s still serving up both LXD and Incus variants of the metadata file.

Since then we’ve learned that LXD users will be blocked from consuming the images.linuxcontainers.org server, and as of now LXD 5.20 cannot consume the images: remote anymore and so it has been removed entirely (along with support for Incus metadata files).

But as LXD 5.0 LTS can still consume the images: remote for a few more months, the Incus fix was left in, in case images.linuxcontainers.org had stopped (or was going to stop) providing the LXD metadata file.

But as it has not, I’ll revert that change too and restore it to just checking for the LXD metadata file (https://github.com/canonical/lxd/pull/12847).

Then from 2024/05/01 all versions of LXD will lose access to images.linuxcontainers.org entirely as per https://discuss.linuxcontainers.org/t/important-notice-for-lxd-users-image-server/18479
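
To illustrate why duplicate metadata entries can trigger the "More than one match" error, here is a simplified sketch. It is not LXD's code: the catalogue layout (one `fingerprint metadata-file` pair per line), the shortened fingerprints, and the function names `count_matches` and `only_lxd_metadata` are all assumptions made for illustration.

```shell
# Illustration only, not LXD's implementation.
# Input format (assumed): one "<fingerprint> <metadata-file>" pair per line.

# count_matches PREFIX: print how many entries on stdin have a fingerprint
# that starts with PREFIX.
count_matches() {
    awk -v p="$1" 'index($1, p) == 1 { n++ } END { print n + 0 }'
}

# only_lxd_metadata: keep entries described by the LXD metadata file and
# drop the Incus variant.
only_lxd_metadata() {
    awk '$2 == "lxd.tar.gz"'
}

# The first image is listed once per metadata variant; the last entry is a
# made-up fingerprint for a second, unrelated image.
catalogue='5112eb18 lxd.tar.gz
5112eb18 incus.tar.gz
0123abcd lxd.tar.gz'

# Both variants are considered: prints 2, so the lookup is ambiguous.
printf '%s\n' "$catalogue" | count_matches 5112eb18

# Only the LXD metadata file is considered: prints 1.
printf '%s\n' "$catalogue" | only_lxd_metadata | count_matches 5112eb18
```

Under these assumptions, filtering down to a single metadata variant before matching is what makes the lookup unambiguous again.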

Thank you @tomp for the rundown on the whole situation.

We’ve been keeping up to date on the situation with images.linuxcontainers.org. This is one reason why we’re also working on our own image server, so we can be in control of our own destiny.

We also hope to make our image server customizable and available to the community. We understand that losing access to the image server has already started causing problems. I guess this thread is just one of those issues.

We look forward to being a part of the solution for the LXD / Incus community.

The fix is building to go into 5.0/candidate now:

https://launchpad.net/~canonical-lxd/+snap/lxd-5.0-candidate

We can then test it and, if happy, roll it out to 5.0/stable.

How would I go about testing this?

I have a few test clusters I can try this with.

Once it’s in 5.0/candidate for amd64 I’ll let you know, then you can run snap refresh lxd --channel=5.0/candidate on each cluster member in your test cluster.

It will refresh on to revision 5.0.3-ffb17cf which includes the fix.
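
After refreshing, each cluster member can be checked for the fixed build. A minimal sketch, assuming `snap list lxd` prints the snap name in the first column and the version in the second (as in the snap list output quoted earlier in this thread). `lxd_version_is` is a hypothetical helper name, not part of snap or LXD.

```shell
# lxd_version_is VERSION: hypothetical helper, not a snap or LXD command.
# Reads `snap list lxd` output on stdin and succeeds (exit status 0) only if
# a row for the lxd snap reports exactly VERSION.
lxd_version_is() {
    awk -v want="$1" '$1 == "lxd" && $2 == want { found = 1 }
                      END { exit !found }'
}

# Usage on each test cluster member:
#   snap list lxd | lxd_version_is 5.0.3-ffb17cf && echo "fix installed"
```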

5.0/candidate for amd64 now has 5.0.3-ffb17cf and is working for me:

snap refresh lxd --channel=5.0/stable --cohort="+"
2024-02-09T09:41:53Z INFO Waiting for "snap.lxd.daemon.service" to stop.
lxd (5.0/stable) 5.0.3-9b310f5 from Canonical✓ refreshed

lxc image list
+-------+-------------+--------+-------------+--------------+------+------+-------------+
| ALIAS | FINGERPRINT | PUBLIC | DESCRIPTION | ARCHITECTURE | TYPE | SIZE | UPLOAD DATE |
+-------+-------------+--------+-------------+--------------+------+------+-------------+

lxc launch images:ubuntu/jammy c1
Creating c1
Error: Failed instance creation: Failed getting remote image info: Failed getting image: More than one match for the provided partial fingerprint "1d5a5e1420abf1c7ec25a56a6c1d645bd4456d1dd0b19ba92eadd5fb62b4e1e8"

snap refresh lxd --channel=5.0/candidate --cohort="+"
2024-02-09T09:42:25Z INFO Waiting for "snap.lxd.daemon.service" to stop.
lxd (5.0/candidate) 5.0.3-ffb17cf from Canonical✓ refreshed
lxc launch images:ubuntu/jammy c1
Creating c1
Starting c1