Email: firstname.lastname@example.org • Mastodon: @email@example.com • Debian: benh (Salsa, QA) • Gitweb: git.decadent.org.uk • GitHub: bwhacks
I'm attending the Linux Plumbers Conference in Lisbon from Monday to Wednesday this week. This morning I followed the "Distribution kernels" track, organised by Laura Abbott.
I took notes, included below, mostly with a view to what could be relevant to Debian. Other people took notes in Etherpad. There should also be video recordings available at some point.
Speaker: Bruce Ashfield, working on Yocto at Xilinx.
Yocto's kernel build recipes need to support multiple active kernel versions (3+ supported streams), multiple architectures, and many different boards. Many patches are required for hardware and other feature support including -rt and aufs.
Goals for maintenance:
Other distributions have similar goals but very few tools in common. So there is a lot of duplicated effort.
Supporting developers, distro builds and end users is challenging. E.g. developers complained about Yocto having separate git repos for different kernel versions, as this led to them needing more disk space.
Speaker: Senthil Rajaram & Anatol Belski from Microsoft
Microsoft chose Yocto as the build tool for maintaining Linux distros for different internal customers. They wanted to use a single kernel branch for different products, but it was difficult to support all hardware this way.
Maintaining config fragments and a sensible inheritance tree is difficult (?). It might be helpful to put config fragments upstream.
Laura Abbott said that the upstream kconfig system had some support for fragments now, and asked what sort of config fragments would be useful. There seemed to be consensus on adding fragments for specific applications and use cases like "what Docker needs".
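To make the idea of application-specific fragments concrete, here is a minimal Python sketch of how a base configuration and a use-case fragment could be combined, with later fragments overriding earlier ones. The upstream tree already ships scripts/kconfig/merge_config.sh for this job; the sketch only illustrates the override rule and is not a replacement for it. The fragment contents are illustrative, not an actual list of what Docker needs, and the function names are my own.

```python
import re

# "CONFIG_FOO=value" and "# CONFIG_FOO is not set" are the two line forms
# that carry information in a kernel config fragment.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")

# Illustrative fragments only; not a real distribution's configuration.
BASE_FRAGMENT = """\
CONFIG_NAMESPACES=y
CONFIG_CGROUPS=y
# CONFIG_OVERLAY_FS is not set
"""

DOCKER_FRAGMENT = """\
CONFIG_OVERLAY_FS=m
CONFIG_VETH=m
"""


def parse_fragment(text):
    """Return {symbol: value}; value is None for 'is not set' lines."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = None
    return options


def merge_fragments(*fragments):
    """Merge fragments in order; later fragments override earlier ones."""
    merged = {}
    for text in fragments:
        merged.update(parse_fragment(text))
    return merged


def render(options):
    """Write the merged options back out in config-file syntax."""
    lines = []
    for name in sorted(options):
        value = options[name]
        if value is None:
            lines.append(f"# {name} is not set")
        else:
            lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render(merge_fragments(BASE_FRAGMENT, DOCKER_FRAGMENT)), end="")
```

A merge like this knows nothing about Kconfig dependencies, so the result would still need to be passed through the kernel's own configuration step (for example `make olddefconfig`) before building.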
Kernel build should be decoupled from image build, to reduce unnecessary rebuilding.
Initramfs is unpacked from cpio, which doesn't support SELinux. So they build an initramfs into the kernel, and add a separate initramfs containing a squashfs image which the initramfs code will switch to.
Speaker: Don Zickus, working on RHEL at Red Hat.
Lots of discussion about whether config can be shared upstream, but no agreement on that.
Kyle McMartin(?): Everyone does the hierarchical config layout - like generic, x86, x86-64 - can we at least put this upstream?
Speaker: Matthias Männich, working on Android kernel at Google.
Why does Android need it?
Project Treble made most of Android user-space independent of device. Now they want to make the kernel and in-tree modules independent too. For each kernel version and architecture there should be a single ABI. Currently they accept one ABI bump per year. Requires single kernel configuration and toolchain. (Vendors would still be allowed to change configuration so long as it didn't change ABI - presumably to enable additional drivers.)
ABI stability is scoped - i.e. they include/exclude which symbols need to be stable.
ABI is compared using libabigail, not genksyms. (Looks like they were using it for libraries already, so now using it for kernel too.)
Q: How can we ignore compatible struct extensions with libabigail?
A: (from Dodji Seketeli, main author) You can add specific "suppressions" for such additions.
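The scoping idea can be shown with a small Python sketch. It is deliberately tool-agnostic: each ABI is reduced to a mapping from symbol name to an opaque fingerprint string, which is an assumption of mine and not how libabigail represents or compares an ABI. Only symbols on the stable list are checked, so changes elsewhere are ignored. All symbol names and fingerprints below are made up.

```python
def scoped_abi_diff(old_abi, new_abi, stable_symbols):
    """Compare two ABI descriptions, but only for the symbols in scope.

    old_abi and new_abi map symbol names to an opaque fingerprint of the
    symbol's type signature. How that fingerprint is produced is out of
    scope here; it is a stand-in for whatever an ABI tool reports.
    """
    removed = []
    changed = []
    for symbol in sorted(stable_symbols):
        if symbol not in old_abi:
            # Not in the baseline, so there is nothing to compare against.
            continue
        if symbol not in new_abi:
            removed.append(symbol)
        elif old_abi[symbol] != new_abi[symbol]:
            changed.append(symbol)
    return {"removed": removed, "changed": changed}


if __name__ == "__main__":
    # Hypothetical data for illustration.
    old = {"core_alloc": "f1", "core_free": "f7", "internal_helper": "x1"}
    new = {"core_alloc": "f2", "internal_helper": "x2"}
    stable = {"core_alloc", "core_free"}
    print(scoped_abi_diff(old, new, stable))
```

In this example the change to `internal_helper` is not reported because it is outside the stable set, while the removal of `core_free` and the change to `core_alloc` are.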
Speaker: Guillaume Tucker from Collabora.
KernelCI currently builds an arbitrary branch with an in-tree defconfig or a small config fragment.
Some in audience questioned whether building a package was necessary.
Possible further improvements:
Seems like a pretty close match. Adding support for different use-cases is healthy for KernelCI project. It will help distro kernels stay close to upstream, and distro vendors will then want to contribute to KernelCI.
Someone pointed out that this is not only useful for distributions. Distro kernels are sometimes used in embedded systems, and the system builders also want to check for regressions on their specific hardware.
Q: (from Takashi Iwai) How long does testing typically take? SUSE's full automated tests take ~1 week.
A: A few hours to build, depending on system load, and up to 12 hours to complete boot tests.
Speaker: Alice Ferrazzi of Gentoo.
Gentoo wants to provide safe, tested kernel packages. Currently testing gentoo-sources and derived packages. gentoo-sources combines upstream kernel source and "genpatches", which contains patches for bug fixes and target-specific features.
Testing multiple kernel configurations - allyesconfig, defconfig, other reasonable configurations. Building with different toolchains.
Tests are implemented using buildbot. Kernel is installed on top of a Gentoo image and then booted in QEMU.
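As a rough illustration of a QEMU boot test, the Python sketch below builds a QEMU command line and decides pass or fail from the serial console log. It is not Gentoo's buildbot setup. It assumes an x86-64 guest, boots the kernel directly with `-kernel` and `-initrd` rather than installing it into an image, and assumes the guest prints a marker string (here the hypothetical `BOOT-OK`) and then powers off.

```python
import subprocess


def qemu_command(kernel, initrd, memory_mb=1024):
    """Build a QEMU command line that boots a kernel on a serial console."""
    return [
        "qemu-system-x86_64",
        "-m", str(memory_mb),
        "-kernel", kernel,
        "-initrd", initrd,
        # panic=1 makes the guest reboot after a panic; together with
        # -no-reboot this makes QEMU exit instead of hanging.
        "-append", "console=ttyS0 panic=1",
        "-nographic",
        "-no-reboot",
    ]


def boot_succeeded(console_log, marker="BOOT-OK"):
    """Return True if the log shows the marker and no kernel panic."""
    if "Kernel panic" in console_log:
        return False
    return marker in console_log


def run_boot_test(kernel, initrd, timeout=300):
    """Boot the kernel and report success; a hang counts as a failure."""
    try:
        proc = subprocess.run(
            qemu_command(kernel, initrd),
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return boot_succeeded(proc.stdout)
```

A CI step could call `run_boot_test("bzImage", "initramfs.cpio.gz")` once per kernel configuration and toolchain combination; both file names are placeholders.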
Generalising for discussion:
Don Zickus talked briefly about Red Hat's experience. They eventually settled on GitLab CI for RHEL.
Some discussion of what test suites to run, and whether they are reliable. Varying opinions on LTP.
There is some useful scripting for different test suites at https://github.com/linaro/test-definitions.
Tim Bird talked about his experience testing with Fuego. A lot of the test definitions there aren't reusable. kselftest currently is hard to integrate because tests are supposed to follow TAP13 protocol for reporting but not all of them do!
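To show what a test harness has to rely on, here is a minimal Python sketch of a TAP result parser. It handles only the plan line and `ok` / `not ok` result lines with optional `SKIP` and `TODO` directives, which is a small subset of TAP13 (no YAML blocks, no bail-out handling). The sample test names are made up. Output that omits the plan or uses other line formats is exactly what makes integration awkward.

```python
import re

_PLAN = re.compile(r"^1\.\.(\d+)\b")
_RESULT = re.compile(
    r"^(?P<status>ok|not ok)\b"
    r"(?:\s+(?P<number>\d+))?"
    r"\s*-?\s*"
    r"(?P<description>.*?)"
    r"(?:\s*#\s*(?P<directive>(?i:skip|todo))\b.*)?$"
)

# Made-up sample output for illustration.
SAMPLE = """\
TAP version 13
1..3
ok 1 - open device
not ok 2 - read device
ok 3 - mmap device # SKIP no hardware
"""


def parse_tap(text):
    """Return (plan, results) parsed from TAP output.

    plan is the declared number of tests, or None if no plan line was seen.
    Lines that are neither a plan nor a result are ignored.
    """
    plan = None
    results = []
    for raw in text.splitlines():
        line = raw.rstrip()
        plan_match = _PLAN.match(line)
        if plan_match:
            plan = int(plan_match.group(1))
            continue
        match = _RESULT.match(line)
        if match is None:
            continue
        directive = (match.group("directive") or "").upper()
        if directive == "SKIP":
            outcome = "skip"
        elif match.group("status") == "ok":
            outcome = "pass"
        elif directive == "TODO":
            outcome = "todo"  # known failure, not a regression
        else:
            outcome = "fail"
        number = match.group("number")
        results.append({
            "number": int(number) if number else None,
            "description": match.group("description").strip(),
            "outcome": outcome,
        })
    return plan, results


def follows_plan(plan, results):
    """Cheap conformance check: a plan exists and the count matches it."""
    return plan is not None and len(results) == plan
```

A test whose output fails `follows_plan` may have crashed part-way through or may simply not emit TAP, and a harness cannot tell the two apart without extra knowledge about that test.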
Speaker: George Kennedy, working on virtualisation at Oracle.
Which distros are using syzkaller? Apparently Google uses it for Android, ChromeOS, and internal kernels.
Oracle is using syzkaller as part of CI for Oracle Linux. "syz-manager" schedules jobs on dedicated servers. There is a cron job that automatically creates bug reports based on crashes triggered by syzkaller.
Google's syzbot currently runs syzkaller on GCE. Planning to also run on QEMU with a wider range of emulated devices.
How to make syzkaller part of distro release process? Need to rebuild the distro kernel with config changes to make syzkaller work better (KASAN, KCOV, etc.) and then install kernel in test VM image.
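A release pipeline could check the rebuilt kernel's configuration before starting a fuzzing run. The Python sketch below reports which required options are not built in. The required list contains only the two options named above (KASAN and KCOV); a real syzkaller setup needs more than these, and the function name is my own.

```python
# Only the options named in the notes; a real setup would list more.
REQUIRED_OPTIONS = ("CONFIG_KCOV", "CONFIG_KASAN")


def missing_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required options that are not set to 'y' in a .config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments, including "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [name for name in required if name not in enabled]


if __name__ == "__main__":
    sample = "CONFIG_KCOV=y\n# CONFIG_KASAN is not set\n"
    print(missing_options(sample))
```

An empty result means the configuration has the listed options built in; anything else names the options the distro config still needs to enable for the fuzzing build.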
How to correlate crashes detected on distro kernel with those known and fixed upstream?
Example of benefit: syzkaller found a regression in rds_sendmsg. The bug had been fixed upstream and the fix backported into the distro, but it then reappeared in Oracle Linux. It turned out that patches to upgrade rds had undone the fix.
syzkaller can generate test cases that fail to build on old kernel versions due to symbols missing from UAPI headers. How to avoid this?
Q: How often does this catch bugs in the distro kernel?
A: It doesn't often catch new bugs but does catch missing fixes and regressions.
Q: Is anyone checking the syzkaller test cases against backported fixes?
A: Yes [but it wasn't clear who or when]
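One cheap way to check for missing fixes is to look for upstream commit references in the distro branch's commit messages. The Python sketch below assumes the branch follows one of three common conventions: the stable-tree lines `commit <sha> upstream.` and `[ Upstream commit <sha> ]`, or the `(cherry picked from commit <sha>)` line added by `git cherry-pick -x`. Whether a given distro records backports this way is an assumption, and the function names are my own.

```python
import re

_UPSTREAM_REF = re.compile(
    r"^[ \t]*(?:commit ([0-9a-f]{40}) upstream\.?"
    r"|\[ Upstream commit ([0-9a-f]{40}) \]"
    r"|\(cherry picked from commit ([0-9a-f]{40})\))[ \t]*$",
    re.MULTILINE,
)


def backported_commits(messages):
    """Collect upstream commit IDs referenced in the given commit messages."""
    found = set()
    for message in messages:
        for match in _UPSTREAM_REF.finditer(message):
            # Exactly one of the three alternatives captured a hash.
            found.add(next(group for group in match.groups() if group))
    return found


def missing_fixes(upstream_fixes, messages):
    """Return upstream fix commits with no matching backport reference."""
    present = backported_commits(messages)
    return [sha for sha in upstream_fixes if sha not in present]
```

This only shows that a backport was applied at some point. It would not have caught the rds_sendmsg case, where later patches undid a fix that was already present, so running the reproducers against the distro kernel is still needed.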
Google has a public database of reproducers for all the crashes found by syzbot.
Other possible types of fuzzing (mostly concentrated on KVM):