CI for the Debian kernel team

Starting just after Christmas, I have been working on CI for all the kernel team's packages on Salsa. The salsa-ci-team has done great work on producing a common pipeline that is usable for most packages with minimal configuration. However, for some packages a lot more work was required.

Linux

I started with the most important package, linux itself. This now has about 1.1 GiB of source spread over 76,000 files. That turns out to be a problem for the pipeline, which currently puts unpacked source in artifacts: it is far beyond the limits of what Salsa allows. I worked around this by using modified versions of the extract-source and build jobs that use the packed source package as the artifact. The output of the build job is compatible with the common test jobs.
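To get a feel for why unpacked source is a poor fit for artifacts, the size and file count of a tree can be measured before deciding what to upload. The following is only a minimal sketch: the function names and the `limit_bytes` parameter are hypothetical, and it is not part of the Salsa CI pipeline or of the linux packaging.

```python
import os


def tree_stats(root):
    """Return (file_count, total_bytes) for regular files under root.

    Symbolic links are skipped so that link targets are not counted.
    """
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            count += 1
            total += os.path.getsize(path)
    return count, total


def fits_artifact_limit(root, limit_bytes):
    """Check an unpacked tree against a caller-supplied (hypothetical) limit."""
    _count, total = tree_stats(root)
    return total <= limit_bytes
```

Run against an unpacked linux tree, a check like this would report roughly the figures quoted above, which is what makes the packed source package the more practical artifact.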

The linux package also takes a lot of resources to build: around 80 minutes on the fastest PC I have at home (if ccache is not primed). Salsa's shared CI runners seem to be about 10 times slower than that, so it is completely infeasible to do even one full build in CI. Instead I defined a new build profile that includes only the smallest kernel configuration, without debug info, and the user-space packages. This still takes over an hour with the Salsa CI runners, but I don't think we can improve this much without losing a lot of code coverage.

Our Git repository for linux also does not contain the upstream source, so the extract-source job has to fetch that. The common extract-source job uses origtargz to do that, and if the orig tarball is not already in the archive this will run uscan. That led me to a new problem: our debian/watch file could only find tarballs linked from the front page of www.kernel.org, and we're sometimes working with different upstream versions. There is actually no single page listing all tarball releases of Linux, and tarballs for release candidates are dynamically generated by CGit and unsigned. So I changed debian/watch to fetch from Git, which is what we were already doing with our own genorig.py script.

Unfortunately, running uscan against a Git upstream, with some files excluded (as there are still a few upstream files we consider non-free), is about twice as slow as it could be. Since I had to modify the extract-source job anyway, I've continued using genorig.py there.
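The general idea of producing an orig tarball with some upstream files left out can be sketched as follows. This is an illustration only and does not reproduce genorig.py: the function name, its parameters and the glob-style exclusion patterns are all assumptions, and it works from an already checked-out tree rather than from Git.

```python
import fnmatch
import tarfile


def make_orig_tarball(source_dir, output_path, top_dir, exclude_patterns):
    """Pack source_dir into output_path (xz-compressed) under top_dir/.

    Any member whose path relative to top_dir matches one of
    exclude_patterns is left out. Patterns use fnmatch syntax, where
    '*' also matches '/'.
    """
    def keep(tarinfo):
        # Member names look like "top_dir/sub/file"; strip the prefix.
        rel = tarinfo.name[len(top_dir):].lstrip("/")
        for pattern in exclude_patterns:
            if fnmatch.fnmatch(rel, pattern):
                return None  # excluded; excluded directories are not descended into
        return tarinfo

    with tarfile.open(output_path, "w:xz") as tar:
        tar.add(source_dir, arcname=top_dir, filter=keep)
```

Filtering while the archive is being written means the tree is only packed once, instead of being packed and then repacked to drop the excluded files.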

A full build log for linux is over 200 MiB, and even with the reduced build profile it would be much longer than Salsa's limit of 4 MiB. I therefore opted to use the 'terse' build option (which translates to V=0), but made the builds of user-space tools ignore this option so that blhc could still do its work. (The kernel itself cannot use the same hardening options, so blhc is not useful there.)
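The decision described here, honouring 'terse' for the kernel but not for the user-space tools, amounts to something like the sketch below. It is a hypothetical illustration in Python, not the logic in the package's debian/rules: the function name and the `honour_terse` flag are invented, and passing V=1 in the non-terse case is an assumption.

```python
def make_verbosity_args(build_options, honour_terse=True):
    """Map a DEB_BUILD_OPTIONS-style string to a make verbosity argument.

    build_options is a space-separated list such as "parallel=4 terse".
    With honour_terse=False the 'terse' option is ignored, so compiler
    command lines stay visible in the log for tools like blhc.
    """
    options = build_options.split()
    if honour_terse and "terse" in options:
        return ["V=0"]
    return ["V=1"]  # assumption: verbose output whenever terse is not applied
```

For example, `make_verbosity_args("parallel=4 terse")` gives `["V=0"]` for the kernel build, while `make_verbosity_args("parallel=4 terse", honour_terse=False)` gives `["V=1"]` for the user-space tools.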

Finally, with the CI pipeline running, blhc and lintian showed a lot of problems that we hadn't been attending to. I've fixed all the blhc errors (with some careful suppressions), all the lintian errors, and the most straightforward lintian warnings.

firmware-nonfree

The firmware-nonfree package also has huge "source" (about 560 MB) and needed some of the same modifications, but is quick to build so did not require a special build profile.

Running lintian over firmware-nonfree reminded me that I needed to sort out the unusual and inconsistent handling of machine-readable copyright information in this source package. I had already done most of that work on a private branch in 2020, so this is mostly ready, but I still need to resolve a licensing issue with AppStream metadata.

Other packages

For kernel-handbook, there was already a trivial "CI" pipeline used to push static pages to the web site. I've replaced this with the common pipeline plus a job that will push the pages from each build on the master branch.

For everything else, it was straightforward to enable the common pipeline with a little bit of configuration.