Enhancing our ZFS support on Ubuntu 19.10 - an introduction

That’s odd, my USERDATA dataset (and I mean the literal USERDATA one, which acts as a container dataset) has canmount=off. I experienced some odd behavior regarding this, as I set this property to on in my dataset but it kept being switched to noauto after a reboot. I filed #1849179 about this.

Yes, you need to have ZSYS installed (apt install zsys) to get the annotations on any new user (unlike the default user, whose annotations are hardcoded in the installer). Those will be needed once we ship it by default, as zsys relies on those annotations to tie user datasets to system datasets, so that it can roll back system data alone or system + userdata together.

Note that you should change the prefix from org.zsys to com.ubuntu.zsys (the final 19.10 release uses com.ubuntu.zsys; earlier builds used org.zsys).
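
If you have to do that rename by hand, here is a rough sketch rather than an official tool. It assumes the old annotations are locally set user properties named org.zsys:something; the function name zsys_rename_prefix is made up for illustration. It only prints the zfs set / zfs inherit commands, so you can review them before running anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of zsys): reads tab-separated
# "name<TAB>property<TAB>value" lines on stdin, as produced by
# "zfs get -H -o name,property,value", and prints the commands that would
# copy each org.zsys:* user property to com.ubuntu.zsys:* and then clear
# the old one. Nothing is executed here.
zsys_rename_prefix() {
    awk -F '\t' '
        $2 ~ /^org\.zsys:/ {
            new = $2
            sub(/^org\.zsys:/, "com.ubuntu.zsys:", new)
            printf "zfs set %s=\"%s\" %s\n", new, $3, $1
            printf "zfs inherit %s %s\n", $2, $1
        }'
}

# Possible usage on a system with ZFS (-s local keeps only locally set properties):
# zfs get -H -o name,property,value -s local all | zsys_rename_prefix
```

Printing instead of executing is deliberate: values with unusual characters may need different quoting, so check the output first.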

That should work by default though (I tested this locally a week ago, I’ll retry). But any locally imported pools should be reimported on reboot.

I had my zfs user account script adding canmount=noauto to user zfs pools, which made them not come up at all at boot. And I couldn’t get a revert to canmount=on to stick. I ended up reinstalling and just not setting canmount at all on the USERDATA zpools I was creating.

I figured I was messing around with zfs systemd services and must have messed something up myself.

Disabling my zpool-local service (which had imported my local non-bpool/rpool zpools) and rebooting resulted in none of my local non-bpool/rpool zpools importing at boot. I have zsys installed.

Do you mind opening a bug against zsys please (https://github.com/ubuntu/zsys)? I’ll give it another try.

Bug opened here: https://bugs.launchpad.net/ubuntu/+source/zsys/+bug/1849522

How to do ZFS striping and mirroring in the new ZFS on Root configuration?

Sorry if I’m asking something already answered elsewhere, but I am wondering how I should plan out SSDs and datasets (or partitions, in the old terminology). I would like to build an Ubuntu-based PC that runs Windows in KVM/QEMU virtual machines. I am considering two Samsung 970 EVO Pro 1TB NVMe M.2 drives in a ZFS striped configuration (similar to RAID 0). From what I have read, a striped pool is built from multiple identical disks, each acting as a vdev, and ZFS spreads the data evenly across all vdevs in parallel, which should roughly halve the transfer time with two identical SSDs. (I would also take a daily backup onto a SATA drive.) In the future, I may replace the two striped 1 TB drives with two 2 TB drives in a mirror configuration. I believe these types of configurations require two identical SSDs, with each SSD given entirely to ZFS. I would also use a third SSD (500 GB or 1 TB) on its own for the Windows virtual machine. Of course, I would like to keep the configuration zsys compatible.
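
To make the two layouts concrete, this is roughly what I have in mind, as a sketch only. The pool name tank, the device paths and the helper build_zpool_cmd are placeholders I made up; the helper just prints a zpool create command with -n (dry run), so nothing gets created.

```shell
#!/bin/sh
# Hypothetical helper: prints a dry-run "zpool create -n" command for a
# striped or mirrored pool. Pool name and devices are placeholders.
build_zpool_cmd() {
    layout=$1
    pool=$2
    shift 2
    case "$layout" in
        # Stripe: every disk becomes its own top-level vdev.
        stripe) echo "zpool create -n $pool $*" ;;
        # Mirror: all listed disks form a single mirror vdev.
        mirror) echo "zpool create -n $pool mirror $*" ;;
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

build_zpool_cmd stripe tank /dev/nvme0n1 /dev/nvme1n1
build_zpool_cmd mirror tank /dev/nvme0n1 /dev/nvme1n1
```

This covers a plain data pool only; how it fits with the installer’s bpool/rpool layout is exactly what I am unsure about.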

As I understand it, ZFS on root uses other partitions as well (an ext4 partition alongside the ZFS ones). If this ext4 partition sits next to the ZFS partition on the same disk, would that prevent striping or mirroring across identical disks? In that case, should the root partition and the other partition for the Windows VM be placed on the single third SSD along with the virtual machine, with the program file and user data directories configured as datasets on a striped or mirrored pool on the two separate SSDs? If, on the other hand, the ext4 filesystem sits on top of a ZFS volume, then I would think I could take advantage of ZFS striping, mirroring or disk-adding features with the ZFS commands discussed in the ZFS documentation, but I do not know if this is the case.

This is great stuff.

It appears that UEFI is required to do the installation? It wasn’t obvious to me from the directions, and I’ve tried to avoid UEFI when I can. You might want to highlight this. The installation failed several times until I changed the BIOS setting.

Thanks for trying it out!

This shouldn’t be the case, or you have another issue that isn’t related to UEFI (we don’t do anything different with or without UEFI). It may be worth trying with ext4 and reporting bugs against ubiquity on Launchpad.

Not exactly sure I understand the “try with ext4” suggestion. I did a fresh 19.10 installation and allowed the installation to partition /dev/sda. Here’s what I get:

blkid /dev/sda*
/dev/sda1: UUID="E161-F45D" TYPE="vfat" PARTLABEL="EFI System Partition" PARTUUID="e5ae8c8e-375f-4e98-b753-7b2ee53ff1a7"
/dev/sda2: UUID="abc72cdc-7523-4c6c-ac3a-fe399fed605d" TYPE="ext4" PARTUUID="1aa9d84b-4f93-a142-a1ea-e81448580899"
/dev/sda3: UUID="6235381e-1357-4e72-96d3-d7ee880c3f85" TYPE="swap" PARTUUID="f4539ff9-394d-5047-8469-f6e3defe25e5"
/dev/sda4: LABEL="bpool" UUID="15906117851312693971" UUID_SUB="9693698414459531293" TYPE="zfs_member" PARTUUID="b2430b03-8c77-f846-8643-be1df0629f9b"
/dev/sda5: LABEL="rpool" UUID="9765444499827069448" UUID_SUB="8584116550946854135" TYPE="zfs_member" PARTUUID="44ddd86f-3ae4-534b-9b12-d09242e5a257"

Also:

mcarifio@zenterprise:/boot$p cd /boot/grub/
mcarifio@zenterprise:/boot/grub$p df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda2 45488 8136 33768 20% /boot/grub

This is consistent with part 2 of your tutorial.

I had my Lenovo W541 BIOS set to “legacy” booting at first, so I’m guessing /dev/sda1 was never created? In any event, switching it to UEFI got it all to work.

Reproducing the error would take a little bit of work and I don’t want to “lose” the disk drive that now has a working 19.10 on zfs.

After running Ubuntu 19.10 with ZFS 0.8.1 for two weeks now, I have the impression that the cache hit rates have improved considerably since 0.7.x. I display those hit rates with conky, so they are constantly visible.

I run a Ryzen 3 with 16 GB of RAM; the L1ARC is 4 GB and the L2ARC is 35 GB (VMs) + 10 GB (data). Everything is LZ4 compressed. I run Ubuntu 19.10 as host with Firefox, and I always run a Xubuntu 18.04 virtual machine (2 GB) with Evolution (email), Firefox (WhatsApp) and Transmission (torrents). A couple of times a day I run other VMs for an hour or so, e.g. Windows 10 (news, Dutch TV player), Windows XP (WMP 11) or one of the 20.04 test versions (ext4).

The host runs directly from SSD; the VMs run from 3 striped HDD partitions with a 35 GB L2ARC and a 5 GB LOG. The L1ARC is full at 3.99 GB (5.58 GB uncompressed) and the L2ARC is 50% used, holding 21.2 GB (35.7 GB uncompressed). Those caches were already compressed in ZFS 0.7.x too.

I remember cache hit rates of ~93% for the L1ARC, and I now often see ~99%. The same happened with the L2ARC: I remember hit rates between 3% and 8%, and now I see hit rates of 28%-30%. I’m very happy with this improvement, because my HDDs with >6 power-on years are as good as retired: the three of them together serve only 0.7% of all disk I/O :)
Did we have any improvements or bug fixes in this area?
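
For anyone who wants to compare numbers, here is a minimal sketch of how such hit rates can be derived. It assumes the counters come from /proc/spl/kstat/zfs/arcstats on ZFS on Linux (lines of "name type data" with hits, misses, l2_hits and l2_misses); the function name arc_hit_ratios is just illustrative. These counters are cumulative since the module was loaded, so a monitoring tool that samples over intervals may show different figures.

```shell
#!/bin/sh
# Hypothetical helper: reads arcstats-style text on stdin and prints the
# cumulative ARC and L2ARC hit ratios, computed as hits / (hits + misses).
arc_hit_ratios() {
    awk '
        $1 == "hits"      { h  = $3 }
        $1 == "misses"    { m  = $3 }
        $1 == "l2_hits"   { h2 = $3 }
        $1 == "l2_misses" { m2 = $3 }
        END {
            if (h + m > 0)   printf "ARC %.1f%%\n",   100 * h  / (h + m)
            if (h2 + m2 > 0) printf "L2ARC %.1f%%\n", 100 * h2 / (h2 + m2)
        }'
}

# Possible usage on a ZFS on Linux system:
# arc_hit_ratios < /proc/spl/kstat/zfs/arcstats
```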

This is good news! I didn’t read anything specific about this, but as you can see in the commits, there is regular work on caching in the upstream ZFS repository: https://github.com/zfsonlinux/zfs/search?q=cache&type=Commits

OK, thank you. I did see at least one clean-up of a module to improve cache read performance.

Things that I would personally rate:

  1. Installer > option for ZFS with encryption and custom partitioning scheme for dual-booting systems (I’m currently doing this manually using LUKS and saying it’s cumbersome is an understatement)
  2. Encrypted boot
  3. A GUI for managing snapshots
  4. Single sign on (encryption and user session)
  5. Encryption active while computer is locked/suspended

I would like to add:

  • Persistent L2ARC, because I have frequent power failures that force the L2ARC to start from scratch, and because of the L2ARC I also suspend my system at night instead of powering it off.

Hi @antony-shen, @satadru

I had a very similar experience to yours: on a stock installation, manually importing an existing pool works fine, but the pool disappears after every reboot. I submitted a bug report on Launchpad; it might be the same issue:

https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1850130

Yes, I can see how just adding vdevs to the pool is a way to go. However, booting from an NVMe SSD that can’t be mirrored easily (for example if the mobo only has one M.2 slot) will be a common use case, and adding additional vdevs to that pool risks complete data loss even if those vdevs are mirrored pairs.

A workaround is to use a systemd unit file to import the pools, but I think the fact that the zpool cache is ignored is a bug and should be treated as such.
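
For what it’s worth, the workaround looks roughly like the sketch below. The unit contents, the /sbin/zpool path and the ordering after zfs-import.target are my assumptions and may need adapting; print_import_unit is just an illustrative helper that writes the unit text to stdout for a given pool name.

```shell
#!/bin/sh
# Hypothetical generator: prints a oneshot systemd unit that imports one
# pool by name at boot. Paths and ordering are assumptions, not the
# packaged ZFS units.
print_import_unit() {
    pool=$1
    cat <<EOF
[Unit]
Description=Import ZFS pool $pool
After=zfs-import.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/zpool import $pool

[Install]
WantedBy=multi-user.target
EOF
}

# Possible usage ("tank" is a placeholder pool name):
# print_import_unit tank | sudo tee /etc/systemd/system/zpool-import-tank.service
# sudo systemctl daemon-reload && sudo systemctl enable zpool-import-tank.service
```

Note that the unit will report a failure if the pool is already imported by the time it runs.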

Thanks @didrocks for the pointer on Twitter about submitting a bug for this.

Agreed, I have seen very similar improvements in the ARC. I don’t use an L2ARC, but my L1ARC suddenly got fantastic at caching after the upgrade. It was pretty good before, but now it’s amazing.

[Graph: zfs_arcstats_hitratio-month]

No change in usage from before either. Pretty standard desktop usage plus a Windows10 VM. Hourly snapshot replication to a backup drive.

Good to see my observation confirmed with a very good graph. I noticed another improvement. I run weekly incremental backups of my lz4 compressed VMs (320 GB). I use a 1 Gbps Ethernet and it runs at 200 Mbps, due to the CPU load of >90% on the receiving Pentium 4 HT. I noticed that the link now achieves a constant 200 Mbps instead of wildly fluctuating between say 40 and 200 Mbps.

My backup configuration seems to be an invitation for maximizing problems:

  • from a 2018 AMD Ryzen 3 to a 2003 Intel Pentium 4
  • from 64-bits to 32-bits
  • from Ubuntu 19.10 (Linux) to FreeBSD-12 (Unix)
  • from 3 x SATA-3 (striped) to 2 x IDE + 1 x SATA-1 (striped)
  • from 16 GB DDR4 to 1.25 GB DDR

But it just worked out of the box!! And now with zfs 0.8.1 at a constant max speed!

Hello,

On my new laptop, I want to give ZFS as rootfs a try. The installer looks pretty good, no issues at all. But I really need to encrypt at least my home directory. Is there a way to do that? Even booting to a console to enter the passphrase and then starting X manually is fine for me.

Thanks for your help,

Best regards

Yoann