ZFS focus on Ubuntu 20.04 LTS blog posts

Hey everyone!

Here is the place to discuss my new blog post series on the work we have done in Ubuntu 20.04 LTS on experimental ZFS support and ZSys.

This is the place to discuss the content, and I’ll update this first post to reference any new episodes.

The first topic is available here:

As a reminder, we had 2 blog posts on Ubuntu ZFS support in 19.10:

So might we see encrypted ZFS snapshots in the 20.10 release or later?

ZFS encryption is indeed on the roadmap for 20.10.

Second blog post is now published: ZSys general presentation

As someone who has heard good things about ZFS but wasn’t technically up to setting it up and managing it, I’m looking forward to trying this soon.

@didrocks After reading the “ZSys general presentation” blog post, may I suggest changing the GRUB date format to ISO-style year-month-day (e.g. 2020-05-26)? It looks like zsysctl uses this format, and it is probably the most unambiguous numeric format.

As a European, where everything uses the same date format, I can tell you the ISO style doesn’t make sense to me at all :slight_smile:

The good news is that the date format follows the format configured on the system, for both GRUB and zsysctl (my zsysctl screenshots were taken in a VM with the default US configuration, hence the different format).

And is there any possibility that this will also be backported to a 20.04 point release?

Yes, but it has not been decided yet which point release. It certainly won’t be in .1 at the end of July, which is focused on bug fixes and small enhancements.

And here we go with the third blog post on ZSys state management!

Hey @didrocks, many thanks for putting together this series of posts.

I must say, I’m looking forward to seeing how the multiple machines functionality works. It looks like it will be a good way to run different Ubuntu releases and flavours on the one machine, while still sharing the same home directory. (If I properly understand how this works.)

Regards, Mal.

New blog post! Today on ZSys commands for state management!

What is the goal of user IDs, such as at the end of rpool/USERDATA/root_cjoc63? What constraints do they have?

The 20.04 installer uses the same ID for both the root user and the regular account. Looking at the script (lines 552-560), it uses the same ID for any user it finds. This is different from the ID used by the machine though.

The OpenZFS guide (see “Create datasets”) uses the same ID for the root user as for the machine, but then later generates another ID for the regular user.

Both of these seem strange to me. I would expect them to all be different, with no reuse.

Using the same or different IDs for users and the system has no impact. The installer generates a single ID and applies it to every dataset it creates. After a revert or a similar operation, different IDs will be used.

The association isn’t made by ID, so this is just an implementation detail without any impact. A future blog post will explain the association, and how this all works, in more detail :slight_smile:

Is there a way to disable autozsys snapshots on selected datasets?

For example I have separate datasets set up for Videos, Music, Virtualbox machines and other data that are space hungry. I don’t want snapshots of these - if I delete files from them I want the space freed up. I can do this with zfs-auto-snapshot, but so far I haven’t found a way to selectively turn off zsys user backups.

Yes! You will need to turn those into separate persistent datasets (under rpool/

Yes, you can do it by moving those datasets outside of the <pool>/USERDATA/ namespace (the same goes for <pool>/ROOT/ with system datasets), either by moving them directly under / or by recreating a hierarchy with mountpoint inheritance, in <pool>/home/<youruser>/Videos for instance. This will all be explained in great detail (using system datasets, but the same applies to user datasets) in part 8 of the series.
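
To make the namespace rule above concrete, here is a minimal Python sketch. It assumes, based only on this reply, that datasets below <pool>/USERDATA/ and <pool>/ROOT/ are covered by ZSys state saves and that everything else is persistent. The helper name `is_zsys_managed` and the example dataset names are hypothetical, and this is not how ZSys itself makes the decision.

```python
# Namespaces that, per the reply above, are covered by ZSys state saves.
# Assumption: any dataset outside them is treated as persistent.
MANAGED_NAMESPACES = ("USERDATA", "ROOT")


def is_zsys_managed(dataset: str) -> bool:
    """Return True if `dataset` sits below <pool>/USERDATA/ or <pool>/ROOT/."""
    parts = dataset.strip("/").split("/")
    # parts[0] is the pool name, parts[1] the top-level namespace.
    return len(parts) >= 3 and parts[1] in MANAGED_NAMESPACES


if __name__ == "__main__":
    for name in (
        "rpool/USERDATA/alice_cjoc63/Videos",  # hypothetical: still inside USERDATA
        "rpool/home/alice/Videos",             # hypothetical: recreated hierarchy
        "rpool/Videos",                        # hypothetical: directly under the pool
    ):
        status = "managed" if is_zsys_managed(name) else "persistent"
        print(f"{name} -> {status}")
```

A quick check like this can help you list which of your space-hungry datasets would still need to be moved.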

Great! Thank you @didrocks, looking forward to reading the rest of your series as they come out

You’re welcome! And speaking of which, after creating a lot of state saves in the previous posts, it’s now time to explain how we garbage collect them: ZSys state collection.

Can we have a way to “demote” a manual save to an automatic save?

Suppose the scenario from the state collection article:

-----------|---------------|-----------------------------K------------------------> (time)

where K is a manual save I made for some purpose, and we’re trying to keep three saves in that time interval. If I now decide I don’t need K anymore for its original purpose, I likely wouldn’t notice that it’s also a save inside a considerable time interval, and deleting it would get me this:

-----------|---------------|------------------------------------------------------> (time)

I could of course check the dates and times of surrounding snapshots before making a decision but in practice that’s tedious and will be forgotten.

Instead I would prefer to just demote it, which I suppose could be done by simply renaming it to autozsys_(some id):

-----------|---------------|-----------------------------|------------------------> (time)

Now zsys can take over and delete it whenever the garbage collection policy allows.
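
As a rough illustration of the rename I have in mind, here is a small Python sketch. It only builds the new snapshot name and the matching `zfs rename` command without running anything. The `autozsys_` prefix comes from the naming above; the helper names, the example dataset and the ID `ab12cd` are hypothetical, and whether ZSys would really treat a renamed snapshot as an automatic save is exactly the supposition I’m making.

```python
from typing import List

AUTO_PREFIX = "autozsys_"  # prefix used above for automatic saves


def is_automatic(snapshot: str) -> bool:
    """Return True if the part after '@' starts with the automatic prefix."""
    _, _, snap = snapshot.partition("@")
    return snap.startswith(AUTO_PREFIX)


def demoted_name(snapshot: str, new_id: str) -> str:
    """Build the name a manual save would get if it were demoted."""
    dataset, sep, _ = snapshot.partition("@")
    if not sep:
        raise ValueError(f"not a snapshot name: {snapshot!r}")
    return f"{dataset}@{AUTO_PREFIX}{new_id}"


def rename_command(snapshot: str, new_id: str) -> List[str]:
    """Return the `zfs rename` invocation as an argument list (not executed)."""
    return ["zfs", "rename", snapshot, demoted_name(snapshot, new_id)]


if __name__ == "__main__":
    manual = "rpool/USERDATA/alice_cjoc63@before-upgrade"  # hypothetical save K
    print(is_automatic(manual))
    print(" ".join(rename_command(manual, "ab12cd")))  # hypothetical ID
```

A state usually spans several datasets, so a real demote would have to rename the snapshot on each of them consistently.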

It would probably be useful to have a “promote” command as well. Suppose I notice something went rather wrong with my files in a complicated way, and I see that the auto snapshot from three hours ago was okay. I would then promote that one to a manual save with a name like “lastgood” or something, which both makes sure it won’t be garbage collected in the next day (most hourly snapshots would be) and lets me easily identify it. I can then sort out my files at my leisure.

The garbage collection rules are consecutive and do not overlap, right?

The actual rules imply this is the case, but the names confused me at first. From what I understand (and ignoring the 20 save limit for the moment):

  • Today is implicitly never GCd.
  • PreviousDay keeps 3 saves for yesterday (the entire interval is the day one day ago).
  • PreviousWeek keeps one save per day for the five-day interval of two days ago to six days ago.
  • PreviousMonth keeps one save per seven days for the 28-day interval of seven days ago to 34 days ago.
  • PreviousYear keeps one save per 30 days for the 330-day interval of 35 days ago to 364 days ago.
  • Previous2Years keeps two saves for the 365-day interval of 365 days ago to 729 days ago.
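
To double-check my reading of the list above, here is a short Python sketch that encodes the intervals in “days ago” and verifies that they are consecutive and that their lengths match. The bounds are taken from my list, not from the ZSys source or its configuration file, and the names `Rule`, `is_consecutive` and `rule_for` are made up for this check.

```python
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    name: str
    start: int  # nearest edge of the interval, in days ago (inclusive)
    end: int    # farthest edge of the interval, in days ago (inclusive)


# Intervals as listed above; "today" (0 days ago) is covered by no rule.
RULES = [
    Rule("PreviousDay", 1, 1),
    Rule("PreviousWeek", 2, 6),
    Rule("PreviousMonth", 7, 34),
    Rule("PreviousYear", 35, 364),
    Rule("Previous2Years", 365, 729),
]


def length(rule: Rule) -> int:
    """Number of days covered by the rule, both edges included."""
    return rule.end - rule.start + 1


def is_consecutive(rules: List[Rule]) -> bool:
    """True if each rule starts the day after the previous one ends."""
    return all(b.start == a.end + 1 for a, b in zip(rules, rules[1:]))


def rule_for(days_ago: int, rules: List[Rule] = RULES) -> Optional[str]:
    """Name of the rule covering a save made `days_ago` days ago, if any."""
    for rule in rules:
        if rule.start <= days_ago <= rule.end:
            return rule.name
    return None


if __name__ == "__main__":
    for rule in RULES:
        print(f"{rule.name}: days {rule.start}..{rule.end} ({length(rule)} days)")
    print("consecutive:", is_consecutive(RULES))
```

If the intervals overlapped or left a gap, `is_consecutive` would return False, which is the property I’m asking about.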

Not a direct answer, but note that we will soon start working on a zsysctl state reclaim command to identify which states (most of the time it’s a range of states, not a single one) are taking the most space, which you can then remove manually with zsysctl state remove.

Exactly! Another way (which is a one-time thing) is to run zsysctl service gc --all, which accounts for all snapshots, including manual ones, when we compute the balance of snapshots to keep.

I think in that case you would rather revert to that state in GRUB, wouldn’t you? All snapshots will then have a dependency on the previous filesystem datasets (which thus become clones of the current datasets), and no automated collection will be done until the previous filesystem datasets are removed.

If there are other use cases and we want to “pin” that state (which would apply easily to snapshots, but less so to clones), then indeed, this could be an area for improvement.

This is exactly what you illustrated in your example :slight_smile: Phew, this was hard to articulate (especially as a configuration file), but it seems it was understandable after all (and it is only aimed at advanced users, of course).