Access control list on zfs filesystem gets mangled after container export and import

I am using POSIX access control lists in a container on a zfs storage volume. I’ve exported the container and restored it on another host system, and on the restored container, the ACL doesn’t work as expected.

I cannot chdir into a directory as a specific user, www-data.

The zfs filesystem has acltype=posixacl on both host systems, and I’ve updated the kernel so that both hosts run the same version.
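
For reference, the acltype setting can be verified on each host with something like this (dataset names as in the mount output further down):

$ zfs get acltype main/containers/c1-source       # source host
$ zfs get acltype default/containers/c1-target    # target host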

After restoring the container, the access control list for an affected file looked like this:

# file: repo/
# owner: admin
# group: users
user::rwx
user:4294967295:r-x
group::rwx
mask::rwx
other::r-x

That was obviously wrong; uid 4294967295 does not correspond to any real user. On the original container, the ACL looks like this:

$ getfacl repo/
# file: repo/
# owner: admin
# group: users
user::rwx
user:www-data:r-x
group::rwx
mask::rwx
other::r-x

So it looks like uid/gid confusion of the kind described in this zfs bug report from 2016: https://github.com/openzfs/zfs/issues/4177

Except that bug is supposed to be fixed and I’m using zfs 0.8.3, which already includes the patch.

I deleted the defective ACL entry and recreated it with the correct user:

$ setfacl -x user:4294967295 repo
$ setfacl -m user:www-data:r-x repo

It now looks like this:

$ getfacl repo/
# file: repo/
# owner: admin
# group: users
user::rwx
user:www-data:r-x
group::rwx
mask::rwx
other::r-x

But the error persists:

$ sudo su - www-data -s /bin/bash -c "cd /home/admin/repo"
-bash: line 1: cd: /home/admin/repo: Permission denied
$

On the source container, it works as expected:

$ sudo su - www-data -s /bin/bash -c "cd /home/admin/repo"
$

The zfs volume has posixacl support:

$ mount | grep acl
default/containers/c1-target on / type zfs (rw,relatime,xattr,posixacl)

On the source container, it looks like this:

$ mount | grep acl
main/containers/c1-source on / type zfs (rw,relatime,xattr,posixacl)

Here is the extended config for the source container:

$ lxc config show -e c1-source
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 22.04 LTS amd64 (release) (20231026)
  image.label: release
  image.os: ubuntu
  image.release: jammy
  image.serial: "20231026"
  image.type: squashfs
  image.version: "22.04"
  limits.cpu: "2"
  limits.memory: 2GB
  volatile.base_image: 187a9674b77056a0d466f5058ea72660cb52430dcdf06974ca8cd6c5a47fb6b3
  volatile.eth0.host_name: veth0b0807bf
  volatile.eth0.hwaddr: 00:16:3e:73:6d:b2
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 23c9e221-15bd-448d-8388-afbe65fe0629
  volatile.uuid.generation: 23c9e221-15bd-448d-8388-afbe65fe0629
devices:
  <network devices omitted>
  root:
    path: /
    pool: main
    type: disk
ephemeral: false
profiles:
- custom
stateful: false
description: ""

Here is the extended config for the target container:

$ lxc config show -e c1-target
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 22.04 LTS amd64 (release) (20231026)
  image.label: release
  image.os: ubuntu
  image.release: jammy
  image.serial: "20231026"
  image.type: squashfs
  image.version: "22.04"
  limits.cpu: "2"
  limits.memory: 2GB
  volatile.base_image: 187a9674b77056a0d466f5058ea72660cb52430dcdf06974ca8cd6c5a47fb6b3
  volatile.cloud-init.instance-id: e583dc84-7e19-4089-bbb2-e7f04326adab
  volatile.eth0.host_name: veth5b719c1a
  volatile.eth0.hwaddr: 00:16:3e:73:6d:b2
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: 23c9e221-15bd-448d-8388-afbe65fe0629
  volatile.uuid.generation: 23c9e221-15bd-448d-8388-afbe65fe0629
devices:
  <network devices omitted>
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- custom
stateful: false
description: ""

uname -a on source:

$ uname -a
Linux c1-source 5.4.0-186-generic #206-Ubuntu SMP Fri Apr 26 12:31:10 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

On target:

$ uname -a
Linux c1-target 5.4.0-186-generic #206-Ubuntu SMP Fri Apr 26 12:31:10 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

snap list lxd on source host:

$ snap list lxd
Name  Version      Rev    Tracking  Publisher   Notes
lxd   git-3217ea4  28855  5.0/edge  canonical✓  -

snap list lxd on target host:

$ snap list lxd
Name  Version       Rev    Tracking     Publisher   Notes
lxd   5.19-8635f82  26200  5.19/stable  canonical✓  -

zfs versions on both hosts are identical:

$ zfs --version
zfs-0.8.3-1ubuntu12.17
zfs-kmod-0.8.3-1ubuntu12.17

OS version on source host:

$ cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

OS version on target host:

$ cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

No ideas?

Is there any information missing that might help you suggest a course of action?

And do I need to file a bug?

Hi,

I’ve tried reproducing this issue, but without luck. Do you maybe have a reproducer?

Also, does copying an instance to another host result in the same issue (lxc copy instead of lxc export/import)?

No, I don’t have a reproducer, unfortunately.

How did you attempt to reproduce this? And did your source use lxd 5.0 and your target host lxd 5.19? (I can’t see how this would cause the problem, but apart from the hypervisor, it’s the only difference between the two cases.)

lxc copy would require that the target lxd host is listening, right? Unfortunately, I don’t have any network-attached daemons I can use to test this. My infrastructure is not designed to allow lxd migrations.
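
As I understand it, that would mean something like the following on the source host, with the target’s API exposed on port 8443 (the remote name and address are placeholders):

$ lxc remote add target https://<target-host>:8443
$ lxc copy c1-source target:c1-target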

In the IRC channel somebody suggested that I re-export the container with --optimized-storage, but that feels like a wild guess. Why should that make any difference? And the container is huge (> 43 GB compressed). I’d rather fix the problem with the imported container and understand the cause.
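
If I understand the suggestion correctly, it amounts to re-exporting with something like this (the output filename is just an example):

$ lxc export c1-source c1-source.tar.gz --optimized-storage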

We have a hint in the zfs bug that I linked to that it might be zfs-related, as I observed the uid shift exactly as described there.

I have now upgraded to zfs 2.2.4 using this ppa.

The problem persists.

What do I need to understand about lxc export (vs. --optimized-storage or lxc copy) that I am not getting? I don’t understand why a filesystem archive procedure should cause this problem, assuming that is the cause.

Hi, could you please file an issue?

Here is the reproducer:

#!/bin/sh

set -eu

srcChan=5.0/stable
dstChan=5.19/stable

install_lxd() {
    inst=$1
    chan=$2

    lxc exec "${inst}" -- snap remove lxd --purge
    lxc exec "${inst}" -- snap install lxd --channel="${chan}"

    lxc exec "${inst}" -- apt update
    lxc exec "${inst}" -- apt install -y zfsutils-linux

    lxc exec "${inst}" -- truncate -s 2G /root/zfsacl.img
    lxc exec "${inst}" -- zpool create -O acltype=posixacl -O xattr=sa zfsacl /root/zfsacl.img

    lxc exec "${inst}" -- lxd waitready
    lxc exec "${inst}" -- lxd init —-auto
    lxc exec "${inst}" -- lxc storage create zfsacl zfs source=zfsacl
}

cleanup() {
    lxc rm -f src dst || true
}

trap cleanup EXIT HUP INT TERM

# Create 2 VMs.
lxc launch ubuntu:20.04 src --vm
lxc launch ubuntu:20.04 dst --vm
sleep 30

# Configure LXD and ZFS pool.
install_lxd src "${srcChan}"
install_lxd dst "${dstChan}"

# Create a container and configure random acl.
lxc exec src -- lxc launch ubuntu:22.04 tmp -s zfsacl
sleep 10

lxc exec src -- lxc exec tmp -- apt update
lxc exec src -- lxc exec tmp -- apt install -y acl

lxc exec src -- lxc exec tmp -- useradd test
lxc exec src -- lxc exec tmp -- mkdir -p /secure/repo
lxc exec src -- lxc exec tmp -- chmod -R 700 /secure/repo
lxc exec src -- lxc exec tmp -- setfacl -m u:test:rwx /secure/repo
lxc exec src -- lxc exec tmp -- getfacl /secure/repo
# file: secure/repo
# owner: root
# group: root
#user::rwx
#user:test:rwx
#group::---
#mask::rwx
#other::---

# Export container and move it to another instance.
lxc exec src -- lxc export tmp
lxc exec src -- cat /root/tmp.tar.gz | lxc exec dst -- tee /root/tmp.tar.gz >/dev/null

# Import container and check acl.
lxc exec dst -- lxc import /root/tmp.tar.gz tmp
lxc exec dst -- lxc start tmp
sleep 10
lxc exec dst -- lxc exec tmp -- getfacl /secure/repo
# file: secure/repo
# owner: root
# group: root
#user::rwx
#user:4294967295:rwx
#group::---
#mask::rwx
#other::---

I have another piece of information.

Earlier, I neglected to verify the ACL attached to the parent directory /home/admin. Sure enough, it also has the mangled entry:

$ getfacl /home/admin
getfacl: Removing leading '/' from absolute path names
# file: home/admin
# owner: admin
# group: users
user::rwx
user:4294967295:r-x
group::r-x
mask::r-x
other::---

This is most probably the proximal cause of the Permission denied error: chdir into /home/admin/repo requires execute (search) permission on /home/admin as well, and with the mangled entry www-data falls through to other::---. It does not explain why the uid gets mangled in the first place, however.

Interestingly, I cannot simply add the www-data user to the ACL. If I attempt this, I get this output:

$ setfacl -m user:www-data:r-x /home/admin
setfacl: /home/admin: Malformed access ACL `user::rwx,user:www-data:r-x,user:4294967295:r-x,group::r-x,mask::r-x,other::---': Duplicate entries at entry 3

So something is seeing these two entries as equivalent, except when it comes to evaluating the ACL upon file access.
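
One way to compare what is actually stored on disk might be to dump the raw ACL xattr on both containers (assuming getfattr from the attr package is installed):

$ getfattr -n system.posix_acl_access -e hex /home/admin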

I am hesitant to remove the mangled entry as I want to preserve the faulty state for debugging purposes. Will a snapshot accomplish this?
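
For example, would something like this on the target host be enough to preserve the current state (the snapshot name is arbitrary)?

$ lxc snapshot c1-target acl-debug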

Can I run the reproducer on a VM with an existing lxd installation safely, or do I need a sterile host system?

Can I run the reproducer on a VM with an existing lxd installation safely, or do I need a sterile host system?

Yes, it will create two VMs (src and dst) without making any changes to the host. Just make sure you don’t already have src/dst VMs, as it will remove them.

@amikhalitsyn do you have any idea what could be causing the issue with the UID being changed?
I’m guessing that the UID falls back to a large number (4294967295 is 2^32 - 1, i.e. -1 as an unsigned 32-bit uid) because the user is not found / properly mapped on the target system?

Thanks.

How could the reproducer be modified to work on two different hosts? I suspect that may be a necessary condition.

It creates two VMs and installs LXD on each of them. Then on the first one (src) it creates a container, configures the ACL, exports it, transfers the tarball to the other VM (dst), and finally imports it into the LXD on that other VM.

You can adjust it for two different hosts by simply running the commands prefixed with “lxc exec src” on one host and those prefixed with “lxc exec dst” on another.
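
For example, the export/transfer/import steps would become something like this when run directly on the two hosts (the hostname is a placeholder, and I’m assuming ssh access between them):

# on the source host
$ lxc export tmp tmp.tar.gz
$ scp tmp.tar.gz root@<dst-host>:/root/tmp.tar.gz

# on the destination host
$ lxc import /root/tmp.tar.gz tmp
$ lxc start tmp
$ lxc exec tmp -- getfacl /secure/repo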

Ah, I see. I overlooked that it was creating VMs. That might be sufficient. I will try it out.