As someone who has heard good things about ZFS but wasn’t technically up to setting it up and managing it, I’m looking forward to trying this soon.
@didrocks After reading the “ZSys general presentation” blog post, may I suggest changing the GRUB date format to ISO-style year-month-day (e.g. 2020-05-26)? It looks like zsysctl uses this format, and it is probably the most unambiguous numeric format.
As a European, where everything uses the same date format, I can tell you the ISO style doesn’t make sense at all to me.
The good news is that the date format follows the system’s configured format, for both GRUB and zsysctl (my zsysctl screenshots were taken in a VM with the US default configuration, hence the different format).
Hey @didrocks, many thanks for putting together this series of posts.
I must say, I’m looking forward to seeing how the multiple machines functionality works. It looks like it will be a good way to run different Ubuntu releases and flavours on the one machine, while still sharing the same home directory (if I properly understand how this works).
What is the purpose of the user IDs, such as the one at the end of rpool/USERDATA/root_cjoc63? What constraints do they have?
The 20.04 installer uses the same ID for both the root user and the regular account. Looking at the script (lines 552-560), it uses the same ID for any user it finds. This is different from the ID used by the machine though.
The OpenZFS guide (see “Create datasets”) uses the same ID for the root user as for the machine, but then later generates another ID for the regular user.
Both of these seem strange to me. I would expect them to all be different, with no reuse.
There is no impact from using the same or different IDs between users and the system. At installation time, the same ID is created and applied to any created datasets. After a revert and similar operations, different ones will be applied.
The association isn’t made by ID, so this is just an implementation detail with no impact. A future blog post will detail the association and how this all works.
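In the meantime, a rough way to see this for yourself (the pool name, the _cjoc63-style suffix, and the exact property name below are assumptions for illustration, not something confirmed above) is to list the user datasets and inspect the ZSys properties that actually carry the association:

    # List user datasets; the suffix after the underscore is just a unique ID
    # generated when the dataset is created (names here are placeholders).
    zfs list -r -o name,mountpoint rpool/USERDATA

    # The user<->system association is stored as a dataset property rather than
    # in the ID itself (property name assumed from how ZSys tags its datasets).
    zfs get com.ubuntu.zsys:bootfs-datasets rpool/USERDATA/root_cjoc63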
Is there a way to disable autozsys snapshots on selected datasets?
For example, I have separate datasets set up for Videos, Music, VirtualBox machines and other space-hungry data. I don’t want snapshots of these: if I delete files from them, I want the space freed up. I can do this with zfs-auto-snapshot, but so far I haven’t found a way to selectively turn off ZSys user snapshots.
Yes! You will need to turn those into separate persistent datasets, outside of the <pool>/USERDATA/ namespace (the same applies to <pool>/ROOT/ for system datasets), either by moving them directly under <pool>/ or by recreating a hierarchy with mountpoint inheritance, e.g. <pool>/home/<youruser>/Videos. This will all be explained in great detail (using system datasets, but the same applies to user datasets) in part 8 of the series.
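In the meantime, here is a minimal sketch of the second option (pool, user and dataset names are placeholders; double-check your mountpoints, and note that you may need to unmount the dataset before renaming it):

    # Create an unmounted hierarchy outside USERDATA whose mountpoints line up
    # with /home/<youruser>:
    sudo zfs create -o canmount=off -o mountpoint=/home rpool/home
    sudo zfs create -o canmount=off rpool/home/youruser   # inherits /home/youruser

    # Move the space-hungry dataset out of the auto-snapshotted namespace:
    sudo zfs rename rpool/USERDATA/youruser_abc123/Videos rpool/home/youruser/Videos

    # Let it inherit its mountpoint (/home/youruser/Videos) from the new parent:
    sudo zfs inherit mountpoint rpool/home/youruser/Videos

Datasets outside <pool>/USERDATA/ and <pool>/ROOT/ are treated as persistent, so ZSys leaves them out of its automated state saves.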
You’re welcome! And speaking of which, after creating a lot of state saves in the previous posts, it’s now time to explain how we garbage collect them: ZSys state collection.
where K is a manual save I made for some purpose, and we’re trying to keep three saves in that time interval. If I now decide I don’t need K anymore for its original purpose, I likely wouldn’t notice that it’s also a save inside a considerable time interval, and deleting it would get me this:
Now zsys can take over and delete it whenever the garbage collection policy allows.
It would probably be useful to have a “promote” command as well. Suppose I notice something went rather wrong with my files in a complicated way, and I see that the auto snapshot from three hours ago was okay. I would then promote that one to a manual save with a name like “lastgood” or something, which both makes sure it won’t be garbage collected in the next day (most hourly snapshots would be) and lets me easily identify it. I can then sort out my files at my leisure.
Not a direct answer to this, but note that we will soon start working on a zsysctl state reclaim command to identify which states are taking the most space (because most of the time it’s a range of states, not a single one); you can then remove them manually with zsysctl state remove.
Exactly! Another way (which is a one-time thing) is to run zsysctl service gc --all to account for all snapshots, including manual ones, when we compute the balance of snapshots to keep.
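As a concrete sketch of those commands (the state ID below is a made-up placeholder, and the exact flags of zsysctl show and zsysctl state remove may differ on your version, so treat this as an assumption rather than a reference):

    # One-time garbage-collection pass that also takes manual saves into account
    # when balancing which snapshots to keep:
    sudo zsysctl service gc --all

    # List the current states, then remove a specific one you no longer need
    # ("autozsys_k7grhj" is a placeholder ID):
    zsysctl show --full
    sudo zsysctl state remove autozsys_k7grhj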
I think in that case you would rather revert to that state in GRUB, wouldn’t you? All snapshots will then have a dependency on the previous filesystem datasets (which thus become clones of the current datasets), and no automated collection will be done until the previous filesystem datasets are removed.
If there are other use cases where we want to “pin” a state (which would apply easily to snapshots, but less so to clones), then indeed, this could be a point of improvement.
This is exactly what you illustrated in your example. Phew, this was hard to articulate (especially as a configuration file), but it seems it was understandable after all (and it’s only aimed at advanced users, of course).