Enhancing our ZFS support on Ubuntu 19.10 - an introduction

I ran the desktop installer version of 19.10 w/ zfs until my system got over 60% full, at which point my Samsung PM981 NVMe started taking several minutes just to do a simple ‘apt update’.

One of the issues was UI freezes that were unrecoverable without a hard reset.

I started a thread on reddit about how to improve its performance once over half full, but I have since abandoned it for a btrfs @ and @home layout w/ an xfs /var partition. In case anyone’s interested, here’s the thread; it got a fair amount of activity in /r/zfs:

https://www.reddit.com/r/zfs/comments/fhy2bq/can_someone_tell_me_why_single_drive_zfs_so_slow/

IMO zfs is not ready for prime time on Ubuntu. Here are some of the most interesting fixes that were proposed to me but that I unfortunately never tried (I still have my NVMe w/ zfs in an external enclosure; I might stick it back in my Thinkpad just to try some of this stuff):

For UI lockups I had to roll with the following in my /etc/modprobe.d/zfs.conf:

...
# Limit per-vdev queue depth for synchronous writes
options zfs zfs_vdev_sync_write_min_active=2
options zfs zfs_vdev_sync_write_max_active=10
# Cap total concurrent I/Os per vdev
options zfs zfs_vdev_max_active=100
# Cap dirty (not-yet-written) data at 150 MiB
options zfs zfs_dirty_data_max=157286400
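
(If you want to try these yourself, note that modprobe.d options are only read when the zfs module loads, so you either need to regenerate the initramfs and reboot, or poke the runtime-tunable parameters directly - something like this, using zfs_dirty_data_max as the example:)

sudo update-initramfs -u -k all        # pick up /etc/modprobe.d/zfs.conf on next boot
echo 157286400 | sudo tee /sys/module/zfs/parameters/zfs_dirty_data_max   # apply immediately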

Oh, I also ran fio tests of zfs w/ a few different record sizes, plus xfs and btrfs on the same machine (albeit with a fresh copy of Ubuntu, sans my old home folder):

[image: fio benchmark results]

Another interesting thing is that, on paper, the zfs numbers look fine. In practice it was excruciatingly slow.

I’m hoping the Ubuntu devs work some of these fixes into the installer, or at least give end users a comprehensive set of zfs tuning options in the installer.


You got a lot of very bad advice (wild guesses) from semi-intellectuals on the subject. I run ZFS on a 512GB Silicon Power NVMe SSD and often fill it to 80%. I use the settings set during installation.

The amount of used storage grows quickly due to the weekly snapshots of up to 15 different VMs. I used 19.10 on my desktop and still use it on my laptop; I now use 20.04 on my desktop and on a second laptop. I never saw anything close to what you report. The hardware I use is a Ryzen 3 2200G and 1st and 2nd generation Intel i5s. I use VirtualBox for the VMs.

I keep my VMs in a dataset separate from ROOT and USERDATA, mainly because I need dnodesize=legacy there for send/receive compatibility with FreeBSD. Almost all my Linux VMs boot within 10 seconds. I have used zfs since Ubuntu 18.04 for data storage and have booted from it since 19.04, basically by copying an ext4 install to zfs. Since 19.10 I use the standard Ubuntu install. During that time, I also used zfs on a Phenom II X4 without any issue.
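
(For anyone curious, a dedicated VM dataset with that property looks something like this; rpool/VMS is just an example name, dnodesize is the real property:)

sudo zfs create -o dnodesize=legacy rpool/VMS   # keeps send streams receivable on FreeBSD
sudo zfs get dnodesize rpool/VMS                # verify the setting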

On all my 4 systems, zfs has been rock solid.

How did you measure that 50% full? If you use Nautilus or standard CLI commands, you risk missing the snapshot storage in the hidden .zfs folders, so the pool could have been >90% full, and then the system slows down. I measure storage usage using the zfs and zpool commands.
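
Something like the following shows the real usage, snapshots included (rpool being the default pool name created by the installer):

zpool list rpool                # overall pool size, allocated and free space
zfs list -o space -r rpool      # per-dataset usage with snapshot usage broken out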

I’m not sure if I understand you correctly. I have a separate pool imported (and mounted), but it disappears each time I reboot the system. Is this the case you are talking about, i.e. is this expected behavior?

Hey! No, it’s not intended; it’s a bug (caused by a bug in openZFS itself: https://github.com/openzfs/zfs/issues/10160), and jibel and I are fixing it today (actually, testing is almost done).
Note that we will update zfs and grub for this. The grub part will come later (but before release) because we have other fixes for it.

My general comment was that, in general, you shouldn’t need to add more pools, since you can extend existing ones with additional disks; but that’s unrelated to the fact that we needed to investigate and fix the bug, and we just did.
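
(For readers wondering what “extend existing ones” looks like in practice, a rough sketch - the device paths are placeholders, and be aware that adding a top-level vdev is hard to undo:)

# Turn a single-disk vdev into a mirror (adds redundancy, not capacity)
sudo zpool attach rpool /dev/disk/by-id/nvme-EXISTING /dev/disk/by-id/nvme-NEW
# Or add a second top-level vdev to grow capacity
sudo zpool add rpool /dev/disk/by-id/nvme-NEW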


Thanks for the clarification and your work on this issue.

I’m waiting for the fix as LXD pools are also affected.

@mi-chu, @mfdes: Please watch for https://launchpad.net/ubuntu/+source/zfs-linux/0.8.3-1ubuntu9 to be published in focal (it’s currently in focal-proposed).

Thanks @didrocks, will check it out.

Cool, I downloaded Focal Fossa (development branch) and the installer set up the ZFS root very nicely.

I like zsysctl and the grub integration - the potential is great.

One question though - the 2.2G swap is not enough for suspending to disk. I didn’t find a bug for that; is this a known issue? Or is there an install-time workaround?

Thanks!

I think it is a known issue. I also had an hdd in the system, so I defined a swap partition on the hdd and added it to fstab at a lower priority. So I have two swap entries like this:
UUID=6d546509-476b-4dde-a70c-07231b2fef83 none swap discard,pri=2 0 0
UUID=c464cd6f-0f3b-4afe-873e-01e2438ed73a none swap sw,pri=1 0 0

The first one is the nvme ssd as defined by the installer, the second one my hdd. If swap is needed, it uses the nvme ssd first; for hibernation it would use the hdd. But to be honest, I suspend and do not hibernate.
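
(You can check which swap devices are active and with which priority using plain util-linux tools:)

swapon --show      # devices, sizes, usage and priority
cat /proc/swaps    # the same information straight from the kernel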

Any proposals for a workaround, like resizing the swap zvol and adding to the existing swap, or adding a new one?

I would like to use zfs on a laptop and I want hibernation. It seems a change to the installer to make swap equal to RAM would be best, but time is too short for this release.
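
One workaround I’ve seen suggested (but haven’t tried) is to add an extra swap zvol after install, roughly like the recipe below. rpool/swap2 and the 4G size are just examples, and as far as I know hibernating to a swap zvol is not supported, so this only helps with ordinary swap pressure:

# Create a zvol tuned for swap use (names and size are examples)
sudo zfs create -V 4G -b $(getconf PAGESIZE) \
    -o compression=zle -o logbias=throughput -o sync=always \
    -o primarycache=metadata -o secondarycache=none rpool/swap2
sudo mkswap -f /dev/zvol/rpool/swap2
sudo swapon /dev/zvol/rpool/swap2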

Thanks!

Glad that you like zsysctl and our grub integration :slight_smile:

Ubuntu hasn’t officially supported suspend to disk for several releases now (too many cases of machines never resuming properly due to their hardware configuration made it an unsupportable feature). This is why SWAP >= RAM size hasn’t been the default for some cycles.
Note that we are using the default swap size from the ubiquity installer (the only difference is that for a default ext4 installation, the swap is a file).

However, you can tweak it manually in the installer:

  • go to a live session before starting ubiquity
  • open and edit (as root): /usr/share/ubiquity/zsys-setup
  • Look for:
        # Convert to MiB to align the size on the size of a block
        SWAPVOLSIZE=$(( SWAPSIZE / 1024 / 1024 ))
  • Just below this, override SWAPVOLSIZE with the desired swap size as an integer in MiB, as in the example below.
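
For example, the edited section could end up looking like this (16384 MiB is only an illustrative value; use your own RAM size):

        # Convert to MiB to align the size on the size of a block
        SWAPVOLSIZE=$(( SWAPSIZE / 1024 / 1024 ))
        # Override: force a 16 GiB swap volume (value in MiB, example only)
        SWAPVOLSIZE=16384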

Note: this is also where people can change their default installation and pool options. Do this with care though, as some features can break your installation! (For instance, the bpool features have been carefully picked so as not to break booting from grub, including when booting from a snapshot.)


External pools are importing correctly on my machines, both virtual and bare-metal. Thanks for your help & guidance @didrocks!


I am using LXD and created a dataset for LXD on the rpool root pool.

When you run sudo lxd init, you answer the prompts as follows. I use rpool/LXD as the dataset name; I think it is nice to capitalize LXD, in line with the other datasets.

$ sudo lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: no
Do you want to configure a new storage pool? (yes/no) [default=yes]: yes
Name of the new storage pool [default=default]: default
Name of the storage backend to use (btrfs, dir, lvm, zfs, ceph) [default=zfs]: zfs
Create a new ZFS pool? (yes/no) [default=yes]: no
Name of the existing ZFS pool or dataset: rpool/LXD
...

In older versions of LXD, such as LXD 3.0, you may need to create the empty dataset first so that lxd init is able to use it: sudo zfs create rpool/LXD.
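
(Once lxd init finishes, you can confirm the storage pool is backed by that dataset; default is the pool name from the transcript above:)

lxc storage show default    # "source" should point at rpool/LXD
zfs list -r rpool/LXD       # per-container datasets will appear under here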

Mind opening a bug against upstream LXD? I think they would be able to assess and address that easily.

I installed 19.10 with zfs support and it’s run like a champ. Of course, I didn’t really push it hard or tweak the bpool and rpool. I added a second disk to the rpool but that’s it.

I would like to upgrade 19.10 to 20.04 using do-release-upgrade -d. In the past I’ve made an image copy of the system disk beforehand. Should I do that here? My understanding is zfs + zsys make snapshots before installing packages, making rollback possible. Does that also apply for upgrades like this? If the upgrade fails (say the graphics driver doesn’t work, a common event), how do I “roll back”?

I’ve installed zsys but I don’t have the client zsysctl. Is this expected?

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  zsys           0.2.2+19.04  amd64        ZFS SYStem integration

I understand that sometimes things get harder before they get easier and that includes “boot on zfs”. But between boot on zfs, snappy and now systemd-homed https://systemd.io/HOME_DIRECTORY/, looking at a file’s pathname and the file’s contents has become a forensics exercise. Then there’s automounting. Or cloud files (gdrive, s3, onedrive). My three ubuntu boxes each have 160+ mount points, 100+ of those are snaps. My fc32 box has 60 mount points, half of them snaps. Yeah, yeah, the dinosaur roars. But engineering is the art of managing complexity. Nothing’s getting simpler right now.

Once zfs is “baked in”, I think it will help enormously. I’d love to install something and be able to recover from the automatic snapshot if things don’t go according to plan. I spend a lot of time preparing for disaster before doing something, which, while certainly educational, is time-consuming. So push ahead please.


This is one of our goals. As I answered privately, we did some tweaking of the default layout and options between 19.10 and 20.04. If you have the chance, it’s better to reinstall 20.04 with ZFS on root instead of upgrading. Upgrades are supported, but you will not benefit from the fine tuning we have done (this is only set by the installer).
For the same reason, reverting between releases is one of our goals, and from 20.04 on you can expect to be able to robustly revert from any release you upgrade to (like a 20.10 → 20.04 revert, or 22.04 → 20.04). It’s not considered supported for 20.04 → 19.10.
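
Roughly speaking, a revert is done from the grub menu (the “History for …” entries let you boot an earlier system state, including the automatic pre-upgrade one), and you can inspect the states zsys has saved with:

zsysctl show    # lists the machine's saved system and user states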

The zsysctl client is available starting with 20.04, so it’s normal that you don’t see it in 19.10 :slight_smile:

And thanks for your kind words!


Is there a way to run the fine tuning manually after the upgrade? I’ve installed a bunch-o-packages and configured ${HOME}**. This isn’t earth shattering, I can redo it, but it is time-consuming. If I can follow a manual recipe to fine tune, I’ll go that route.

If I do a complete re-install, I may create a separate pool for /home in anticipation of systemd-homed. That’s off topic.

The kind words are (currently :-)) deserved. Every upgrade to fedora from 29 up to 32 effectively disables zfs until it “catches up”. I’m still waiting on fc32. Having a distro forge ahead on this helps enormously with zfs adoption, licensing concerns notwithstanding. This is (one of) Canonical’s “value propositions” (says the ex-Canonical guy) and it should be applauded for its vision. Nice job.


For the fine tuning, some changes involve upgrading the bpool to version 5000 while disabling some features. You also need to do that for rpool and align with our defaults. You can see them for both rpool and bpool in the zsys-setup script (scripts/zsys-setup in the ubuntu ubiquity source). Warning: -O sync=disabled shouldn’t be left on the final datasets; it is only used to speed up installation (we reset it to standard after the installation, also in zsys-setup).
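
(If you replicate that by hand, resetting the property afterwards is a one-liner; rpool is the installer’s default pool name:)

sudo zfs set sync=standard rpool    # undo the install-time sync=disabled speed-up
sudo zfs get -r sync rpool          # check nothing is still set to disabled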

There are other changes: since 20.04 we always have an ESP partition, even if you booted in legacy mode. We install grub there instead of in a dedicated partition and bind-mount /boot/grub from /boot/efi/grub (more details in the script I referenced above).
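
(On an installed system the bind mount ends up as an fstab entry roughly like this; check your own /etc/fstab for the exact line:)

/boot/efi/grub  /boot/grub  none  defaults,bind  0  0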

I think that should do most of it, but it means playing with partitions and such. Ensure you boot at least once, as zsys will migrate some tags from the older scheme to the newer one :slight_smile:


So I decided to take your advice and do a fresh 20.04 install after making an image copy of the disk. I think this puts me on a solid foundation. I’m now left with the task of reinstalling all the (manual?) packages from the previous installation/image/disk, adding users explicitly when packages don’t, copying the relevant parts of /home/mcarifio/** and generally adding/changing local configurations. @didrocks I seem to remember you were working on a tool that would make one Ubuntu machine similar to another, maybe OneConf (https://wiki.ubuntu.com/OneConf)? Am I remembering this wrong? Is there a tool/script that does this kind of mirroring of an installation (other than low-level backups)? Please advise. Thanks.
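
(The best I’ve come up with so far for the package half of this is stock apt tooling - package names can differ between releases, so expect a bit of manual cleanup:)

# On the old installation or a mount of its image: record manually installed packages
apt-mark showmanual > manual-packages.txt
# On the fresh 20.04 install: reinstall them
sudo apt install $(cat manual-packages.txt)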