Enhancing our ZFS support on Ubuntu 19.10 - an introduction

didrocks · November 20, 2019, 7:20am

Until there is native support in the installer (probably post-LTS), I think your best bet is to use ecryptfs directly, on your home directory: Security - eCryptfs.

mistur · November 20, 2019, 7:38am

This is what I currently have on my setup, but for my new one I really want to use ZFS. Is there no way to have the home directory's ZFS dataset encrypted?

mistur · November 20, 2019, 9:27am

This looks interesting:

I’ll try this!

mistur · November 20, 2019, 2:29pm

It works very well! It asks me for a password to load the key at the beginning of the boot. Then I can log in to my session with my password.

Yoann

lammert-nijhof · November 22, 2019, 6:24pm

Monday I will have an exciting day, since I should receive my 512GB NVMe SSD from Newegg. I will put the experimental ZFS system on it, but I do NOT want to use the WHOLE DISK.
At midnight I expect the system storage to look like:

PLANNED END RESULT

The NVME SSD Silicon Power (sda) of 512GB:

ext partition + bpool 2.1GB the standard Ubuntu stuff as generated by the install.
rpool 37.9GB the standard Ubuntu pool, but REDUCED in size.
freespace 15GB as fallback to restore the 55GB of current ZFS installation, if needed.
L2ARC for a HDD datapool dpool, say 15GB (or 30GB), no ZIL/LOG since that dataset hardly uses sync writes.
vpool 442GB as storage for my virtual machines.

The HDDs 500GB Seagate (sdc) and 1TB WD-Black (sdb):

archives pool, unchanged 584GB at sdb4.
Ubuntu 19.10 on ext4, 18GB at sdc5 copied from WD-Black to Seagate.
Swap, 18GB at sdc6, to try hibernation in future.
dpool, 870GB striped (454 + 416 GB) with one dataset with copies=2.

PREPARATION

Friday, I installed Ubuntu 19.10 minimal install on a 40 GB USB disk (IDE). The rpool ashift=12, so that is OK. The system functions as Virtualbox host and only needs system utilities and Firefox.
I booted the installation from the USB HDD. I prepared the USB system as far as possible: conky with tricks for zfs, zram/lz4; installed hddtemp and lm-sensors. That system will be moved to the NVMe SSD with “sudo dd if=/dev/sdd of=/dev/sda”. It works; I have tested it in VirtualBox.

Saturday is my weekly backup day, so I will incrementally back up all datasets (send/receive). I will create two additional partition image files (gnome disk utility) for my current ZFS install and for the ext4 install. If something goes terribly wrong, I can restore my current ZFS system partition. I will reduce the number of snapshots for each dataset to two.
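A minimal sketch of what one such incremental send/receive step could look like. The dataset, snapshot, host and target names are hypothetical placeholders, and the function only prints the commands so they can be reviewed before running them as root.

```shell
#!/bin/sh
# Print (do not run) the commands for one incremental backup of a dataset.
# Every name passed in below is a hypothetical placeholder.
backup_plan() (
    dataset=$1   # source dataset
    prev=$2      # snapshot that already exists on both sides
    new=$3       # snapshot to create now
    host=$4      # ssh login of the backup server
    target=$5    # dataset on the backup server
    echo "zfs snapshot ${dataset}@${new}"
    # -i sends only the changes between the previous and the new snapshot
    echo "zfs send -i @${prev} ${dataset}@${new} | ssh ${host} zfs receive ${target}"
)

backup_plan rpool/USERDATA/home week46 week47 backup@server tank/backup/home
```

The incremental stream can only be received if the previous snapshot still exists on both sides, which is why at least one common snapshot has to be kept when pruning.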

Sunday I will use the new ZFS 0.8 device removal command (zpool remove) to remove a 320GB laptop disk from its ZFS datapools.
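As a rough sketch, removing a top-level device with ZFS 0.8 might go as follows. The pool and device names are assumptions for illustration; the function prints the commands instead of executing them.

```shell
#!/bin/sh
# Print the commands for evacuating and removing one top-level device.
# Pool and device names are hypothetical.
remove_plan() (
    pool=$1
    dev=$2
    # -n performs no removal; it reports the estimated memory needed
    # for the mapping table that the removal leaves behind
    echo "zpool remove -n ${pool} ${dev}"
    # the real removal copies the data to the remaining devices
    echo "zpool remove ${pool} ${dev}"
    # shows the progress of the evacuation
    echo "zpool status ${pool}"
)

remove_plan dpool /dev/sdd3
```

Device removal of this kind works for single-disk and mirror top-level vdevs, not for raidz.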

Monday

destroy ZFS caches and logs, export datapools;
physically insert the NVME disk;
physically remove the SATA-SSD (128GB) and laptop HDD (320GB);
boot from ext4; check drive assignment sda (NVME),sdb (Seagate) and sdc (WD);
move the installation from USB to NVMe: “sudo dd if=/dev/sdd of=/dev/sda”
remove USB; update grub and reboot from NVME;
create other NVME partitions with gnome-disk-utility;
create vpool datapool and datasets;
move the virtual machines to vpool on the NVME;
delete the old dpool and vpool partitions.
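The export and clone steps of this list could be sketched as below. The device and pool names follow the plan above but should be treated as assumptions; the block size and the conv/status options are illustrative choices, not part of the original plan. The function prints the commands only, because dd overwrites the target disk without asking.

```shell
#!/bin/sh
# Print the commands to export data pools and clone one disk onto another.
# Arguments: source disk, target disk, then the pools to export first.
migrate_plan() (
    src=$1
    dst=$2
    shift 2
    for pool in "$@"; do
        echo "zpool export ${pool}"
    done
    # if= is the disk to read, of= is the disk that gets overwritten
    echo "dd if=${src} of=${dst} bs=4M conv=fsync status=progress"
)

migrate_plan /dev/sdd /dev/sda dpool vpool
```

Checking the drive assignment with lsblk right before running the dd line is worthwhile, since device letters can change between boots.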

Reorganize the HDD partitions as indicated and restore dpool from the backup over my 1 Gbps link.

At the beginning of December I will try dual boot by adding Mate to the existing bpool and rpool. This time I will install Mate on the USB disk and send/receive the content of bpool/BOOT/Ubuntu, rpool/ROOT/Ubuntu and rpool/USERDATA/ to the existing rpool and bpool. Then I will update grub and see what happens.

RESULTS till THURSDAY:
Most of the operation went as planned. However, after I moved the system from the USB disk to the NVMe disk, the system did not want to boot from NVMe, not even after running update-grub. I think there is a difference between booting from SATA and from NVMe. So I went to plan B and installed the system from scratch on the NVMe disk.
I introduced two datasets for my virtual machines, rpool/VBOX and rpool/VKVM, and moved my VMs to those datasets.

I’m happy with the end result! The system is really fast, and host and VMs boot twice as fast. In the past the host booted from SATA-SSD and the VMs from L2ARC and 3 striped HDDs.
I measured the disk throughput in the VMs with Crystal Disk Mark and consistently got ~900MB/s in Win 7. In the gnome-disk-utility I got ~620MB/s the first time, and the following measurements were ~1100MB/s.
I can’t yet explain those differences, but I like the disk throughput in my VMs.

I noticed that by default Ubuntu only added a swap partition of 2 GB, so I added the /dev/sdb swap partition of the ext4 system and gave it a lower priority. So now my total swap is 18GB.
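One way to express such swap priorities is with the pri= option in /etc/fstab, where a higher number is used first. The sketch below only prints candidate fstab lines; the device names and priority values are hypothetical.

```shell
#!/bin/sh
# Print an /etc/fstab line for a swap device with an explicit priority.
# Device names and priorities used below are hypothetical.
swap_entry() (
    dev=$1
    prio=$2   # higher value = preferred; must be 0 or greater
    echo "${dev} none swap sw,pri=${prio} 0 0"
)

swap_entry /dev/nvme0n1p2 10   # small swap on the fast disk, used first
swap_entry /dev/sdb6 5         # large swap on the HDD, used when the first is full
```

After editing /etc/fstab, "sudo swapoff -a && sudo swapon -a" re-activates the entries and "swapon --show" lists the resulting priorities.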

I have added 4 scripts: to create a snapshot for all system datasets, to destroy those snapshots and to roll back those snapshots. The 4th script changes the canmount property of all system datasets, to allow updating zfs in the ext4 system too. I set everything to canmount=noauto, but I had to change it to “on” for the userdata, apt and dpkg related datasets. We will see how that develops in the future.
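The four scripts themselves are not shown in the post. A minimal sketch of the same idea, assuming the dataset names are supplied one per line on standard input, could look like this; it prints the zfs commands rather than running them, and the dataset and snapshot names are made up.

```shell
#!/bin/sh
# Print one zfs command per dataset read from stdin.
# Usage: <dataset list> | snap_plan ACTION ARG
#   create|destroy|rollback: ARG is the snapshot name
#   canmount:                ARG is the property value (on, off, noauto)
snap_plan() (
    action=$1
    arg=$2
    while read -r ds; do
        case $action in
            create)   echo "zfs snapshot ${ds}@${arg}" ;;
            destroy)  echo "zfs destroy ${ds}@${arg}" ;;
            rollback) echo "zfs rollback ${ds}@${arg}" ;;
            canmount) echo "zfs set canmount=${arg} ${ds}" ;;
            *) echo "unknown action: ${action}" >&2; exit 1 ;;
        esac
    done
)

# Hypothetical dataset names; a real list could come from
#   zfs list -H -o name -r rpool/ROOT bpool/BOOT
printf '%s\n' rpool/ROOT/ubuntu bpool/BOOT/ubuntu | snap_plan create before-upgrade
```

Note that a plain zfs rollback only works for the most recent snapshot of a dataset; going further back requires -r, which destroys the newer snapshots.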

satadru · December 2, 2019, 11:23pm

Any chance of instructions for installing 19.10 with zfs root on arm64 using some variant of the experimental installer?

Modifications of these steps seem to work to create the root filesystem: https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS

(I have this working with zfs root on the RPI 4B 4GB.)

But one has to do an initial install and then rsync everything over to the zfs pools, and we then miss the steps for creating the hash-named root and userdata datasets, and whatever needs to be done to enable the zfs userdata folder creation (is that zsys in this context?)

didrocks · December 3, 2019, 7:18am

We only work on the desktop support right now. Once this one is stable (or if anyone in the community is interested in starting on it beforehand), we’ll probably move it to other platforms.

Right now, the how-to you pointed to is the only way I know of for this.
The other way would be to tweak a curtin installation (somewhat similar to https://github.com/ubuntu/zsys-install/blob/master/curtin-zfs.yaml, once refreshed) and integrate this into subiquity.

lammert-nijhof · December 6, 2019, 6:43pm

After installing ZFS on my NVMe drive, I ran into problems with the connection to my backup server, which is based on send/ssh/receive. I have used it since June for approximately 1-2 hours per week to create incremental backups, and it worked fine. My backup server is a Pentium 4 (32-bit) with 2 IDE HDDs and 2 laptop HDDs on SATA-1, 1114 GB in total according to FreeBSD 12.1. I’m retired and Dutch, so I throw nothing away and I reuse stuff as much as possible.

I lost compatibility because of the new feature “large_dnode”, which had been enabled and activated by Ubuntu 19.10. The support in FreeBSD for the feature is in doubt! After trying to receive data from Ubuntu, FreeBSD activated “large_dnode” too, but it did not help. In the matrix of supported features, OpenZFS states FreeBSD does NOT support it.

With some help from Richard Laager, I restored compatibility by setting “dnodesize=legacy” for each of my own datasets on both systems. It took me one very long day to find the solution. I also had to restore all my data to the datasets with this property and destroy the old datasets (500GB). Those actions took another long day. I now have a lot of pain in my mouse-arm.
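For illustration, checking the feature state and switching datasets back to legacy dnodes might be sketched as follows. The pool and dataset names are hypothetical, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Print the commands to inspect large_dnode and set dnodesize=legacy.
# First argument: pool name; remaining arguments: datasets to change.
compat_plan() (
    pool=$1
    shift
    # reports disabled, enabled or active for the pool feature
    echo "zpool get feature@large_dnode ${pool}"
    for ds in "$@"; do
        echo "zfs set dnodesize=legacy ${ds}"
    done
)

compat_plan rpool rpool/USERDATA/home rpool/VBOX
```

The property only applies to files created after it is set, so existing data keeps its larger dnodes. That matches the need described above to restore the data into fresh datasets.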

I know the new OpenZFS setup will improve this compatibility issue, but it will not completely solve it. There still might be a couple of months difference between implementations. So I would propose: “Do not activate features silently, that might hamper send/receive compatibility”.

I have a question too:
When will we hear what will be implemented in Ubuntu 20.04, and when can we start testing and enjoying the first parts of it?

lammert-nijhof · December 8, 2019, 3:55pm

Now that I run Ubuntu/ZFS from a fast NVMe drive (3400/2300 MB/s), I noticed some changes. In the past, reboot times of an Ubuntu virtual machine (VM) from memory cache would be, say, 12 seconds, while boot times were between 25 and 50 seconds, depending on how full the SSD cache was. With the NVMe drive that difference has disappeared: I always boot the VM in 12 seconds, and maybe the reboot from L1ARC is 1 second faster.
So it has some consequences for desktop use:

I’ll already start reducing the max size of L1ARC, since caching does not really speed up my NVMe drive on my Ryzen 3 2200G, while the music, videos and spreadsheets stored on my HDDs do not require caching.
Maybe part of the issue is the CPU time for LZ4 decompression; maybe we should avoid compression just to speed up an NVMe drive.
Or maybe we should be able to stop caching the data of NVME drives and only cache the drive administration.
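Both ideas can already be approximated with existing knobs: the zfs_arc_max module parameter caps the ARC size, and the primarycache property can restrict caching for a dataset to metadata. The sketch below prints the corresponding commands; the 2 GiB cap and the dataset name are arbitrary assumptions.

```shell
#!/bin/sh
# Print the commands to cap the ARC and to cache only metadata for a dataset.
# Arguments: ARC cap in GiB, dataset name (both hypothetical below).
arc_plan() (
    gib=$1
    ds=$2
    bytes=$(($gib * 1024 * 1024 * 1024))   # zfs_arc_max is given in bytes
    echo "echo ${bytes} > /sys/module/zfs/parameters/zfs_arc_max"
    # primarycache accepts all, metadata or none
    echo "zfs set primarycache=metadata ${ds}"
)

arc_plan 2 rpool/VBOX
```

The sysfs setting lasts until the next reboot. A line such as "options zfs zfs_arc_max=2147483648" in /etc/modprobe.d/zfs.conf makes it persistent, followed by an initramfs update on a system that boots from ZFS.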

dcominottim · January 20, 2020, 1:58am

Hello there!

Thank you so much for this great initiative in bringing ZFS to desktops!

Please, I’d like to know where I could submit a feature request for ZFS’ “copies” option to be available in Ubiquity? It’s one of the most interesting features for single drive configs, but in order to be really useful, it needs to have a value specified at partition/pool creation time, otherwise it’ll only apply to newly written files after the installation.

jibel · January 20, 2020, 8:38am

Hi @dcominottim and welcome to the community hub,

We’re glad you enjoy ZFS on Ubuntu desktop. Could you elaborate a bit and explain why you think it’d be useful to enable “copies” at installation time on a single-disk setup?

dcominottim · January 20, 2020, 6:19pm

Thank you a lot for the reply, Jean!

In the context of desktop/laptop Ubuntu installs, it seems reasonable to assume that many users will have a single drive dedicated to the OS, or even a single drive in the whole machine (be it an NVMe SSD, SATA SSD, or HDD). Although the full benefits of ZFS’ redundancy features, etc. would require multiple drives to offer maximum data protection, the ‘copies’ feature can be an interesting option for users who don’t have/can’t use multiple drives for redundancy purposes but still want some extra protection against bitrot and similar issues. But to have the full benefit of ‘copies’, it needs to be enabled – if desired – before the OS install, otherwise it’ll only be applied to newly written data after the install (also useful, but less so).

Just to be clear, I do not propose that the default value of ‘copies’ should be changed from ‘1’ to ‘2’ or ‘3’, but just that the installer should let the user specify ‘2’ or ‘3’ if they so desire. After all, one of the major benefits of ZFS, and one of the reasons people want to use it, is the extra data protection offered by features such as this.
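Until the installer offers this, the property can be set by hand after installation, with the limitation described above that only newly written blocks get the extra copies. A small sketch that prints the commands, using a hypothetical dataset name:

```shell
#!/bin/sh
# Print the commands to set and verify the copies property on a dataset.
# Arguments: number of copies (1, 2 or 3) and the dataset name.
copies_plan() (
    n=$1
    ds=$2
    case $n in
        1|2|3) ;;
        *) echo "copies must be 1, 2 or 3" >&2; exit 1 ;;
    esac
    echo "zfs set copies=${n} ${ds}"
    # -r also lists the value inherited by child datasets
    echo "zfs get -r copies ${ds}"
)

copies_plan 2 rpool/USERDATA
```

When a pool is created manually, the same property can be applied from the start with "zpool create -O copies=2 ...", at the cost of roughly doubling the space used by the data.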

jibel · January 22, 2020, 2:13pm

We received several requests to enable different features of ZFS for different use cases like your suggestion or encryption for example. They are all valid but at this point we cannot integrate them all in the installer especially for an LTS.

However, in order to allow everyone to customize their ZFS installation, we are looking for a way to override the default set of pool features and filesystem properties currently in the installer, maybe based on a configuration file.

If some features or properties are broadly used, not currently set as default and safe to enable we can still enable them by default or add an option in Ubiquity for next point release of the LTS.

We are still looking into this and will keep you posted when something is ready to test in Focal.

dcominottim · January 22, 2020, 9:17pm

Thanks for the kind and insightful reply, Jean!

Question: if I make a new install using the latest Focal daily image, is it already expected to contain the dataset configs, etc. that will be used in the final version? Naturally, I know something new might happen between now and then, but just wanted to know whether all or most ZFS/disk related stuff (in terms of partitions, datasets, etc.) is expected to remain more or less the same as in the current daily images.

jibel · January 23, 2020, 5:32am

Yes, unless there is a good reason to change it, the layout of the pools and datasets will remain the same as it is today.

lammert-nijhof · January 23, 2020, 3:08pm

Might I suggest allowing the size of rpool to be changed during installation, leaving empty space at the end of the disk for e.g. L2ARC, an ext4 partition or another datapool. The same, but less important, goes for the size of SWAP: 2GB is small and not really useful for hibernation.
A second suggestion would be to allow a second installation in the same bpool/rpool datapools. I would use it to try the development edition on real hardware instead of in VirtualBox.
I think the changes would only influence the installer leaving the remainder as-is.

I ran into a 3-day issue with the features (large dnode) you chose for rpool: I could no longer back up my user datasets from rpool to FreeBSD. I think you should ask users which systems they require feature compatibility with and set the features of rpool accordingly. On the other hand, allowing the size of rpool to be changed would let an educated user select those features themselves for a user datapool on the same disk.

averyfreeman · February 3, 2020, 11:50pm

The way you recover your data is from a backup you take on a regular basis, just like with any other file system.

mfdes · March 3, 2020, 11:14pm

The bug I mentioned above:

Also affects the current (3rd March) build of Focal. Since this will be an LTS, I’d like to draw attention to it, since it makes ZFS root unusable for all but casual stuff, and prevents the use of external zpools.

What is the best way to do this? I feel like opening another bug in Launchpad would be redundant.

didrocks · March 6, 2020, 7:26am

We haven’t had the time yet, with the new features to implement first, but we should get to it in the next couple of weeks and will come back with more questions if we can’t reproduce it.

mfdes · March 6, 2020, 9:48am

No worries, thanks @didrocks. So far it’s been simple to reproduce: it’s been happening 100% of the time for me on fresh stock installs using ZFS, both bare-metal and in KVM virtual machines using virtio drivers for storage.