Update [2025] on Docker inside LXD container with ZFS storage with GPU passthrough

Hey everyone

I realize there have been some issues with Docker on ZFS in the past. I just wanted to post a thread here in case anyone is searching the web for information on this.

It seems that newer ZFS versions now support overlay2. I launched a container with the following profile settings:

config:
  security.nesting: "true"
  security.syscalls.intercept.mknod: "true"
  security.syscalls.intercept.setxattr: "true"

I installed Docker and ran a container. Everything worked, and everything is fast.

Here is the output of docker info:

Client:
 Version:    26.1.5
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 26.1.5
 Storage Driver: overlay2
  Backing Filesystem: zfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3a4de459a68952ffb703bbe7f2290861a75b6b67
 runc version: 2c9f5602f0ba3d9da1c2596322dfc4e156844890
 init version: 
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.0-51-generic
 Operating System: Alpine Linux v3.20
 OSType: linux
 Architecture: x86_64
 CPUs: 36
 Total Memory: 62.7GiB
 Name: test-docker
 ID: 31845754-aa9e-440b-b255-0cfd6a8816f7
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Here is what I ran:

docker run -it --rm -d -p 8080:80 --name web nginx

I was also able to access the page served by the container by running curl -v http://lxd-container-ip:8080.

This is a very good out-of-the-box 'just works' experience.

I hope this post can give an updated perspective on this issue. I also think Canonical should update the documentation here.

This also opens up many opportunities. Imagine treating each of your LXD containers as a 'pod' in which you run multiple Docker containers for a single app. This pretty much allows LXD to compete directly with K8s, while being much simpler, more lightweight, and more performant.


Great news! For containers that need Docker, I've been creating block storage volumes (i.e. zvols) and formatting them with ext4 or btrfs, which is super cumbersome for many reasons, but maybe I can stop now.
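For context, the workaround looked roughly like this (a sketch, with placeholder pool/volume/container names): LXD's ZFS driver can create a zvol-backed volume via zfs.block_mode and format it with a non-ZFS filesystem, which you then mount over /var/lib/docker.

```shell
# Old workaround (sketch): ext4-on-zvol volume for Docker's data root,
# so Docker does not sit directly on a ZFS dataset
lxc storage volume create default docker-data \
    zfs.block_mode=true block.filesystem=ext4 size=50GiB

# Attach it to the container at /var/lib/docker
lxc config device add mycontainer docker-data disk \
    pool=default source=docker-data path=/var/lib/docker
```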

What version of ZFS did you test with?

lxc info is returning the following:

  storage_supported_drivers:
  - name: zfs
    version: 2.2.2-0ubuntu9.1
    remote: false

I’m using lxd 5.21.2 via snap.
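In case it's useful to others, you can pull just that section out of the lxc info output with a quick grep:

```shell
# Show the ZFS storage driver version reported by the LXD server
lxc info | grep -A 3 "name: zfs"
```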

I'm happy to report that GPU passthrough also works out of the box.

I was able to run a Podman container inside an LXD container and have it successfully run nvidia-smi via the NVIDIA Container Toolkit:
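For anyone reproducing this: on the LXD host, a plain gpu device on the container should be all that's needed (the device name gpu0 is arbitrary):

```shell
# On the LXD host: pass the GPU through to the container
lxc config device add test-cuda-podman gpu0 gpu
```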

root@test-cuda-podman:~# podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi
Mon Jan 20 23:59:39 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.02              Driver Version: 555.42.02      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro M2000                   Off |   00000000:03:00.0 Off |                  N/A |
| 56%   42C    P8             12W /   75W |       5MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Honestly, right now I'm in shock at how simple this is. I didn't have to do anything; everything just worked out of the box. No more hacks, extra config, etc…

I switched from Docker to Podman since it's simpler and has out-of-the-box support for the NVIDIA CDI (Container Device Interface) spec.
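For reference, if the CDI spec isn't generated for you automatically, the NVIDIA Container Toolkit can produce it inside the container. This is standard nvidia-ctk usage, not something I had to run here:

```shell
# Generate the CDI spec that backs --device nvidia.com/gpu=all
nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

# List the device names the spec exposes
nvidia-ctk cdi list
```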


This page from the Incus documentation suggests that you need zfs.delegate for overlay2 to work. Did you set this, or did it get set automagically somehow?

Can someone shed some light on this?

Oh, I didn't know about the zfs.delegate option. It just worked automagically somehow.
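For anyone landing here later: if it does turn out to be needed in your setup, zfs.delegate appears to be a per-volume storage option (it requires ZFS 2.2+), so something like the following should enable it. The pool and container names are placeholders:

```shell
# Requires ZFS 2.2+; "default" and "test-docker" are placeholders
lxc storage volume set default container/test-docker zfs.delegate=true
lxc restart test-docker
```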