We’d like to migrate to rust-coreutils this cycle as the new default. People can revert if needed. Below are the relevant parts of the internal draft spec.
I’d like to upload the changes as soon as possible once the archive is open, probably next week.
Mechanisms for migrating
Migrating coreutils to a new package is an arduous task, as it is an Essential package, which comes with these requirements:
- It must work when merely unpacked; files must not disappear at any point.
- It must not ship files that are also shipped by other essential packages, because bootstrapping merely extracts them and only then does the proper unpacking runs. Having the same files in multiple packages would lead to indeterminate results.
This ruled out two mechanisms from the start:
- We cannot use alternatives to manage coreutils, because alternatives are configured in the maintainer script - there would be no cp and other binaries until we run the maintainer script.
- We cannot use diversions and have both coreutils and coreutils-from-uutils (or similar) ship /usr/bin/cp - they also would clash at bootstrapping time when we extract the packages [assuming we need to bootstrap with both temporarily], it’s also not particularly clean to have all these diversions on a clean system.
The approach proposed therefore is as follows (“advanced dependency gymnastics”):
- We rename the existing coreutils package to gnu-coreutils, and build it with a gnu prefix, e.g. gnucp (the gnu prefix has prior use in BSD world, otherwise we should have used gnu-).
- We introduce a new source package, coreutils-from (https://git.launchpad.net/~juliank/+git/coreutils-from/) that provides the following binary packages:

Package: coreutils
Pre-Depends: coreutils-from-uutils | coreutils-from
Essential: yes

Package: coreutils-from-uutils
Pre-Depends: rust-coreutils
Provides: coreutils, coreutils-from
Conflicts: coreutils-from
Replaces: coreutils-from, coreutils (<< ${split})
[optionally] Breaks: coreutils (<< ${split})
Protected: yes

Package: coreutils-from-gnu
Pre-Depends: gnu-coreutils
[… as coreutils-from-uutils … ]

- Explanation:
  - We need to use Pre-Depends here because we need coreutils to be working when unpacked, as it is Essential; this applies transitively and may be somewhat hard on the solver/ordering…
  - We need to mark the provider packages coreutils-from-* as Protected: yes to prevent the package manager or the user from trying to switch them, as that would fail: APT would remove the other provider first, and then the binaries would be missing and dpkg would fail (see revert mechanism).
  - We need the Replaces/Conflicts/Provides between the providers as they are not co-installable. We need to use coreutils-from here as we don’t want to conflict with the coreutils metapackage.
  - We need the versioned Replaces and Breaks against the old coreutils such that we can migrate the functionality from it. We can drop the Breaks if they cause issues (there may be a loop here, and apt could decide to upgrade coreutils before installing coreutils-from-*), but the Replaces are needed for unpacking to succeed.
- The coreutils package is empty, it exists just to pull in a provider and to mark the functionality of coreutils as Essential. It’s mostly useful for upgrades.
- The coreutils-from-* packages contain symlink farms for commands (potentially mini wrappers, see security impact), manual pages, and completions, such as:
  - /usr/bin/ls -> /usr/bin/coreutils
  - /usr/share/man/man1/ls.1.gz -> /usr/share/man/man1/rust-ls.1.gz
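
To illustrate why a symlink farm is enough for a multi-call binary, here is a minimal sketch in a throwaway staging directory. The stand-in coreutils script, the three-tool list, and the relative link targets are assumptions made for the demo; the real packages ship absolute links such as /usr/bin/ls -> /usr/bin/coreutils and take their tool list from the provider.

```shell
#!/bin/sh
# Sketch only: a throwaway staging tree, not the real package layout.
stage=$(mktemp -d)
mkdir -p "$stage/usr/bin"

# Stand-in for the multi-call binary: it picks the tool to run from the
# name it was invoked as (hypothetical script, not the real uutils binary).
cat > "$stage/usr/bin/coreutils" <<'EOF'
#!/bin/sh
echo "would run applet: $(basename "$0")"
EOF
chmod +x "$stage/usr/bin/coreutils"

# Illustrative subset of tools; relative targets keep the demo inside $stage.
for tool in ls cp mv; do
    ln -s coreutils "$stage/usr/bin/$tool"
done

# Invoking the link reaches the same program under a different name.
"$stage/usr/bin/ls"    # prints "would run applet: ls"
```

Because every command resolves to the same file, anything that decides by path (such as the AppArmor profiles discussed under security impact) sees only /usr/bin/coreutils.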
Related Debian work
Debian is working on the same but sort of opposite thing, namely, supporting toybox and busybox as alternative providers of the coreutils functionality to provide smaller minimal images. The scheme has been proven to work in their experiments (https://salsa.debian.org/josch/busybox-is-coreutils-demo/).
Impact on minimal images
Assuming that we eventually want to drop the GNU coreutils to universe, and only support the rust-based version, we are looking at a significant problem for minimal images:
Larger image size: A Docker image is currently 75 MB. rust-coreutils comes in at 25 MB vs 7 MB for the classic coreutils, increasing the image size by 18 MB to 93 MB (+24%).
This can be worked around by continuing to commit to classic GNU coreutils on those platforms, but this increases the overhead of having to support and validate two implementations.
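
The size estimate can be reproduced with a few lines of shell arithmetic. This is just the calculation from the figures quoted in this section, assuming the GNU package is fully replaced in the image rather than shipped alongside.

```shell
#!/bin/sh
# Sizes in MB, taken from the figures quoted in this section.
image=75; rust=25; gnu=7

delta=$((rust - gnu))                 # extra size from swapping implementations
new_image=$((image + delta))
percent=$((delta * 100 / image))      # integer division; exact here (1800 / 75)

echo "delta=${delta}MB new_image=${new_image}MB increase=${percent}%"
```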
Security impact
AppArmor profiles don’t work correctly with a multi-call binary. A profile allowing /usr/bin/ls now needs to allow /usr/bin/coreutils (as AppArmor profiles follow symbolic links) and there is no way to identify which of the tools is being called in the profile. The upstream project may want to define an AppArmor profile for coreutils with “hats” for the individual tools and then switch the hat on initialisation, but this only solves the issue partially (doesn’t help for inherited profiles, for example).
This solves some problems, but there may be more:
- Build tiny wrapper binaries that ensure the coreutils binary is called with the right first argument, that is, /usr/bin/rm essentially does argv[0] = "rm"; execv("/usr/bin/coreutils", argv) - this ensures that an AppArmor profile that does e.g. /usr/bin/rm Ux works safely, but doesn’t necessarily work for other things.
- Build coreutils into a dynamic library that simply exposes coreutils_main() and then call that from tiny wrapper binaries.
Testing
With the changes as described above, the existing tests that get triggered for coreutils will automatically be triggered for coreutils-from-uutils, as the new coreutils depends on it (and it is installed in the chroots).
This doesn’t cover everything, as the coreutils are “essential” and not everything depends on them explicitly.
Upgrades
Per above, on upgrade the new coreutils binary package from the coreutils-from source package will be installed, which will pull in coreutils-from-uutils as it is the leftmost dependency.
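
The "leftmost dependency" rule can be modelled roughly as follows. This is a deliberately simplified sketch, not APT's solver: the pick_provider and satisfies helpers are hypothetical, and the model assumes only that an already satisfied or-group is left alone and that otherwise the first alternative is chosen.

```shell
#!/bin/sh
# Names each package satisfies, following the Provides lines in the stanzas.
satisfies() {
    case $1 in
        coreutils-from-uutils|coreutils-from-gnu) echo "$1 coreutils coreutils-from" ;;
        *) echo "$1" ;;
    esac
}

# pick_provider "ALT1 | ALT2" [INSTALLED...]
# Prints the installed package that already satisfies the or-group,
# or the leftmost alternative if nothing installed does.
pick_provider() {
    group=$1; shift
    alts=$(echo "$group" | tr -d ' ' | tr '|' ' ')
    for alt in $alts; do
        for pkg in "$@"; do
            for name in $(satisfies "$pkg"); do
                if [ "$name" = "$alt" ]; then echo "$pkg"; return 0; fi
            done
        done
    done
    set -- $alts      # nothing installed matches: fall back to the leftmost
    echo "$1"
}

# Fresh upgrade: no provider installed yet.
pick_provider "coreutils-from-uutils | coreutils-from"
# After a revert: coreutils-from-gnu satisfies the virtual coreutils-from.
pick_provider "coreutils-from-uutils | coreutils-from" coreutils-from-gnu
```

Under these assumptions the first call prints coreutils-from-uutils and the second prints coreutils-from-gnu, which is why a reverted system would not be switched back on the next upgrade.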
Revert mechanism
The simple mechanism that should work is
apt install coreutils-from-gnu coreutils-from-uutils- --allow-remove-essential
But this has a limitation: APT tries to remove packages first, so the binaries would go missing and dpkg would then fail to unpack the new coreutils. This is an APT limitation and can be fixed in APT; dpkg supports just installing the new package and will remove the other package itself. That is, this works fine:
apt download coreutils-from-gnu
dpkg --install ./coreutils-from-gnu*.deb
(will remove coreutils-from-uutils automagically)
However, to work with existing APT, we can adopt protective diversions. For all coreutils providers:
- In their prerm script, run dpkg-divert --no-rename --add for all coreutils to make dpkg think the files have a different name and not remove the real ones.
- In their preinst script, run dpkg-divert --no-rename --remove to remove those protective diversions; in turn, when our provider is unpacked it will override the leftover protected binaries.

See prerm.in and preinst.in in https://git.launchpad.net/~juliank/+git/coreutils-from/tree/debian?h=main for the details.
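
As a rough idea of what the two maintainer scripts would issue, the sketch below prints (rather than runs) one dpkg-divert command per tool. The tool list and the use of --local are assumptions of this sketch; only --no-rename, --add, --remove and the .remove-bak name come from the description here, and the real scripts in the linked repository may differ.

```shell
#!/bin/sh
# Sketch: print, do not execute, the protective diversion commands.
# TOOLS and --local are assumptions; see prerm.in / preinst.in for the real thing.
TOOLS="ls cp mv"

protect() {     # what the outgoing provider's prerm might run
    for tool in $TOOLS; do
        echo "dpkg-divert --local --no-rename --divert /usr/bin/$tool.remove-bak --add /usr/bin/$tool"
    done
}

unprotect() {   # what the incoming provider's preinst might run
    for tool in $TOOLS; do
        echo "dpkg-divert --local --no-rename --remove /usr/bin/$tool"
    done
}

protect
unprotect
```

With --no-rename, dpkg-divert only records the diversion and leaves the file on disk where it is, which is the property the protective scheme relies on.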
So when you switch from coreutils-from-a to coreutils-from-b:
1. coreutils-from-a prerm adds the diversions, making dpkg think that, for example, ls is ls.remove-bak
2. coreutils-from-a removal does effectively nothing to the files because it will try to remove ls.remove-bak which does not exist, and keep ls untouched
3. coreutils-from-b preinst undoes the diversion
4. coreutils-from-b is being extracted, and as there is no diversion it will take over ls

Alternatively, when APT is fixed/dpkg is also used directly, it executes in the order of 1, 3, 4, 2 which also works (as the file ownership of ls moved to coreutils-from-b by the point we remove coreutils-from-a files).
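
The two orderings can be checked against a toy model of a single file. The sketch below only imitates the behaviour described in this section (a marker file stands in for the diversion, another for file ownership); it is not how dpkg is implemented, and all function names are hypothetical.

```shell
#!/bin/sh
# Toy model of one file (ls) during a provider switch from a to b.
root=$(mktemp -d)

reset()     { rm -f "$root"/*; echo "from a" > "$root/ls"; echo a > "$root/owner"; }
prerm_a()   { touch "$root/diverted"; }            # 1: add protective diversion
remove_a()  {                                      # 2: remove a's files
    if [ -e "$root/diverted" ]; then
        rm -f "$root/ls.remove-bak"                # diverted name, does not exist
    elif [ "$(cat "$root/owner")" = a ]; then
        rm -f "$root/ls"                           # a still owns ls: really removed
    fi
}
preinst_b() { rm -f "$root/diverted"; }            # 3: undo the diversion
unpack_b()  { echo "from b" > "$root/ls"; echo b > "$root/owner"; }  # 4: take over ls

# Run the given steps and report the first one that leaves ls missing.
run_order() {
    reset
    for step in "$@"; do
        "$step"
        [ -e "$root/ls" ] || { echo "ls missing after $step"; return 1; }
    done
    echo "ok: ls is $(cat "$root/ls")"
}

run_order prerm_a remove_a preinst_b unpack_b      # current APT order: 1, 2, 3, 4
run_order prerm_a preinst_b unpack_b remove_a      # fixed APT or dpkg: 1, 3, 4, 2
run_order remove_a unpack_b || true                # no diversions: ls disappears
```

In this model the first two orderings end with ls present and owned by b, while the unprotected removal leaves a window in which ls is missing, which is the failure the protective diversions are meant to avoid.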
Known issues
- basenc, stty, and factor are missing in rust-coreutils 0.0.30-1
- cp, ls, mv: Implement SELinux context handling
- coreutils date -ud '-2 weeks' produces error: unexpected argument '-2' found #7515 (pending release)
- Some packages declare versioned depends coreutils (>= version); we do not handle this.