Testing is an important part of Ubuntu. It happens at many levels, from manual testing of individual bugs, to unit tests in source code, all the way to testing Ubuntu ISOs and pre-installed images (such as those used in the public clouds). My focus for my first 4.5 years at Canonical was on public cloud images, with particular effort on CI/CD pipelines and testing. This past year in Ubuntu Engineering, I've gotten to know the package testing story more intimately. Along with @uralt and @paride, I've been helping to plan the next iteration of autopkgtest-cloud and package testing infrastructure at Canonical. Now I'll take you on the package build, test, and release journey that I've been on.
The Lifecycle of Packages
@mclemenceau wrote a nice blog explaining update-excuses that includes a simplified package lifecycle diagram. What I'm focusing on is the piece on the far right, labeled britney, and expanding on the package testing around it.
Britney is the heart of the CI system for Debian and Ubuntu. It evaluates uploads to determine whether they can migrate between pockets (most importantly, from -proposed into the release pocket). Migration is governed by a set of policies. I won't go into detail on the policies and pockets here; let's focus on the testing aspect.
When a package is uploaded and tests are required, britney queues a message. Ubuntu's autopkgtest-cloud infrastructure pops this message from the queue and starts a test run. Upon completion, test results are saved to an object store. Britney then retrieves the results via the autopkgtest frontend and determines next steps.
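To make the request side tangible: the retry links you see on autopkgtest.ubuntu.com ultimately boil down to a request URL in roughly this shape (the release, package, and trigger values here are invented for illustration):

```
https://autopkgtest.ubuntu.com/request.cgi?release=noble&arch=amd64&package=gdpc&trigger=procps%2F2%3A4.0.4-4ubuntu1
```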
How We Test – autopkgtest-cloud
[Diagram: the autopkgtest-cloud infrastructure]
Above is a wonderfully crude diagram of the autopkgtest-cloud infrastructure. Britney queues a message onto a bus. The autopkgtest-dispatcher service consumes the message; it is the service that runs autopkgtest against remote, ephemeral environments. In the current implementation, these are OpenStack compute instances for all architectures other than armhf. Autopkgtest-dispatcher handles the lifecycle of the remote instances, executing tests via the autopkgtest-virt-ssh backend. Results are pulled from the ephemeral VM and put into a cloud object store. A separate service, autopkgtest-frontend, hosts the static autopkgtest.ubuntu.com site. It reads from the object store and dumps the results into a local SQLite database. This database is available for download from the main page (just click "SQLite database with results (sha256)" to download).
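To give a feel for that database, here's a minimal sketch of pulling it down and querying recent results. The URL mirrors the download link on the main page, and the table and column names are assumptions based on the published schema, so verify against the actual file:

```sh
# Fetch the results database and list recent procps runs on amd64
# (schema assumed: a `test` table joined to a `result` table).
wget https://autopkgtest.ubuntu.com/static/autopkgtest.db
sqlite3 autopkgtest.db "
  SELECT r.version, r.triggers, r.exitcode
    FROM result r JOIN test t ON r.test_id = t.id
   WHERE t.package = 'procps' AND t.release = 'noble' AND t.arch = 'amd64'
   ORDER BY r.run_id DESC LIMIT 5;"
```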
The autopkgtest source documents how tests are defined. The test cases and control file are part of the Debian packaging directory (debian/tests/). Autopkgtest has many different virtualization backends, including virt-null, which operates directly on the local system; virt-lxd, which runs tests in LXD containers or virtual machines; virt-docker; and many more. As mentioned, the Ubuntu testing infrastructure currently uses virt-ssh to connect to ephemeral OpenStack VMs controlled by the test system.
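To make that concrete, here's a minimal hypothetical test definition and a local run with the lxd backend. The package name is a placeholder, and the image alias follows autopkgtest-build-lxd's naming convention (verify with `lxc image list`):

```
# debian/tests/control (hypothetical): run the debian/tests/smoke script
# with the package's own binaries installed ("@" expands to them),
# tolerating output on stderr.
Tests: smoke
Depends: @
Restrictions: allow-stderr
```

```sh
# Build an LXD test image for noble, then run the package's tests in it:
autopkgtest-build-lxd ubuntu:noble
autopkgtest mypackage -- lxd autopkgtest/ubuntu/noble/amd64
```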
What We Test
Autopkgtests are installed-package integration tests. This means the tests are meant to verify that the package is installable and operational on an Ubuntu system. They are not extensive unit tests or integration tests for specific environments. Any Debian package published to the primary archive (archive.ubuntu.com) is eligible for testing, regardless of component (main, universe, multiverse, restricted). Not included in the testing described above is security.ubuntu.com. The Canonical Security organization controls the flow of packages into the security archive, and it has its own infrastructure in place. It utilizes britney and autopkgtest as well, but with its own deployments and practices, separate from the diagram above.
This is also different from the tests executed during a package build. Package builds often run the unit tests included with the upstream source. This post won't get into package build steps, but know that, in many cases, tests shipped in the source are run at package build time.
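For context, in a typical debhelper-based build those tests are invoked by dh_auto_test, and there is a standard knob to skip them; a quick sketch:

```sh
# Normal build: dh_auto_test runs the upstream test suite during the build.
dpkg-buildpackage -us -uc

# Skip build-time tests via the standard DEB_BUILD_OPTIONS mechanism:
DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage -us -uc
```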
The tests themselves vary a great deal. In the past few months, I've noted a few examples that share the same testing intent but have very different implementations and ramifications.
Valgrind
Valgrind takes the ideal of "installed and runnable" to its limit. The Debian test control file is just about as minimal as can be: it executes the single command `valgrind /bin/true`. It also allows stderr output, as valgrind will write to stderr if there is an issue with the binary under test. This test does not exercise all the capabilities of valgrind, nor is it meant to evaluate `true` for possible issues. It proves that valgrind is installable and runnable on the system, and that's about it. I have a personal backlog item to investigate more realistic applications and how to run them, to exercise valgrind a bit more; this is a recommendation from valgrind upstream in their README_PACKAGERS.
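Based on that description, the control file amounts to something like this; a paraphrase, not a verbatim copy of the packaging:

```
Test-Command: valgrind /bin/true
Restrictions: allow-stderr
```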
Procps
Procps is a fine example of both "installed and runnable" and a test for Ubuntu-specific correctness. The test control file lists two tests, a stack-limit test and a systemd-sysctl test, and there are more tests available in the tests directory. test-sysctl-defaults.py checks the specific sysctl configuration the package ships in Ubuntu, which has proven useful: over time, procps sysctl defaults have had to be updated, and we can see some churn on the test itself to ensure we catch regressions.
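To illustrate the idea behind that kind of check, here's a hypothetical shell sketch (not the actual test-sysctl-defaults.py, which is Python) comparing a live sysctl value against the default Ubuntu ships:

```sh
#!/bin/sh
# Ubuntu sets kernel.yama.ptrace_scope=1 via procps' shipped sysctl config;
# fail if the installed package no longer delivers that default.
expected=1
actual=$(sysctl -n kernel.yama.ptrace_scope)
if [ "$actual" != "$expected" ]; then
    echo "kernel.yama.ptrace_scope is $actual, expected $expected" >&2
    exit 1
fi
```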
Gdpc
This was a brand-new package for me that I had to dive into thanks to a procps bugfix I'm trying to shepherd through. Gdpc is a universe package for visualizing output data from molecular dynamics simulations; honestly, not my wheelhouse as a former classical musician and interactive multimedia composer. It had failing tests showing up in proposed-migration, though, so I dove in. The tests run using xvfb, a virtual-framebuffer "fake" X server; it's a handy utility for running a graphical program in a server environment for testing purposes. The tests execute the installed gdpc using the same test files as the built-in tests. This means we test the installed version in the same manner as the built version, which is a useful style of integration test.
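The usual convenience wrapper for this is xvfb-run, from the xvfb package; a minimal sketch, with the arguments as placeholders:

```sh
# -a picks a free display number automatically; the GUI program renders
# into the virtual framebuffer instead of a real X server.
xvfb-run -a gdpc some-input-file.xyz
```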
The gdpc experience also highlights the many possible ways autopkgtest can fail.
Proposed Testing
In my case, the procps migration triggered reverse-dependency tests: packages that depend on procps had their tests queued, installing procps from proposed. The gdpc tests failed with a buffer overflow, certainly something to investigate. I checked as best I could, and it seemed to fail on the run of gdpc itself. Digging in, procps was a dependency not of gdpc as installed, but of the tests, as declared in the debian/tests/control file. This led down a "fun" hole: I tested locally using autopkgtest-virt-lxd, updated the run-unit-test script to add `set -x` to confirm my hunch about which command was failing, and never saw it fail locally. That led to comparing environments, spotting package differences, and working to rectify those (notably, my local run had a different kernel).

Finally, unable to reproduce locally, I tried running gdpc with migration-reference/0. This special trigger means, basically, "run the tests again without any proposed packages; if they fail there, note that the failure is not due to the package in proposed." This run also failed, which counts as a pass for procps (the failure is not a migration blocker), but it means I've found some sort of bug somewhere in gdpc that I'm still spending my free time sorting out. As it stands, I lack the information to open a bug, because I cannot seem to reproduce this anywhere outside the autopkgtest infrastructure, but I'll keep trying. If anyone is interested in helping out: a link to the failing tests, and a PPA containing a procps package with my `set -x` change.
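For anyone who wants to poke at this, the local reproduction looks roughly like the following, reusing the lxd image built earlier; the release is an assumption, and installing the PPA build of procps is left as an extra step:

```sh
# Run gdpc's tests with procps (and only procps) pulled from -proposed,
# mimicking the reverse-dependency run britney triggers:
autopkgtest gdpc \
    --apt-upgrade \
    --apt-pocket=proposed=src:procps \
    -- lxd autopkgtest/ubuntu/noble/amd64
```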
The Future
@uralt recently posted a blog about some of our future visions for autopkgtest-cloud, a future that is slightly different from the present. First, we are moving from launching large numbers of ephemeral virtual machines to launching large numbers of ephemeral containers on LXD clusters. Those who have worked at large scale in the clouds may already see where the gains can come from. Whenever you're running large numbers of ephemeral VMs (and we can chew through thousands of launches a day), you can hit all sorts of edges, from well-known ones like quota limits and noisy neighbors to unexplained launch failures. While moving to LXD clusters isn't a magic bullet, we're hoping that alleviating the constant churn on Canonical's OpenStack cloud will help with stability.
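As a rough sketch of what the container path looks like (instance and cluster-member names invented):

```sh
# Ephemeral instances delete themselves on stop, which suits one-shot test
# runs; --target places the instance on a specific cluster member.
lxc launch ubuntu:noble test-run-42 --ephemeral --target node03
lxc exec test-run-42 -- uname -a
lxc stop test-run-42   # instance is removed automatically
```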
There are also discussions around expanding and improving tests. When I was on the Public Cloud team, we continuously updated and expanded our testing capabilities, covering more and more cases to ensure that Ubuntu runs well everywhere. In Ubuntu, this means adding tests to packages that lack them, or fixing up existing tests to be more comprehensive.
Finally, there are ongoing discussions on how we can improve the development workflow in Ubuntu more generally. Right now there's nothing linking autopkgtest-cloud and Launchpad: if I'm tracking "is my package ready to migrate?", I'm checking several sources across multiple sites and doing some searching across files and pages. While discussions are very nascent, we in Debcrafters are turning our eyes toward these problems. Watch this space for more specs and blogs expanding on how to improve Ubuntu development processes.