Systemd-boot and Full Disk Encryption with TPM and FIDO2
20. Dec 2023 | Alberto Planas | CC-BY-SA-3.0
Systemd-boot and Full Disk Encryption in Tumbleweed and MicroOS
openSUSE Tumbleweed and MicroOS now deliver an image that uses
systemd-boot as boot loader and full disk encryption that is also
based on systemd. The encrypted device can be unlocked with the
traditional password, with a TPM2 (a crypto-device that is already
present in your system) that will attach the device only if the
system is in good health, or with a FIDO2 key that validates the
ownership of a token.
There is a lot to explain here, but basically these changes move the distribution to a safer place. On one hand they make the design of the distribution much simpler, and on the other they follow the current security trends that other distributions are also aligning with.
So, let’s start at the beginning …
systemd-boot
We all know and love GRUB2. It is a good boot loader. It is also
big, complex, rich, massive and tends to move slowly on the
development side.
The openSUSE package for this boot loader contains more than 200 patches. Some of those patches have been there for the last 5, 6 … 10 years. That is an indication of the talent of the maintainers, but it can also signal how slow the upstream contribution process can be.
GRUB2 supports all the relevant systems, including mainframes, arm
and powerpc. It supports multiple types of file systems, including
btrfs and NTFS. It contains a full network stack, a USB stack, a
terminal, can be scripted … In some sense, it is almost a mini OS by
itself.
But then UEFI happened 18 years ago, making almost all the features
provided by GRUB2 somewhat redundant. The system firmware was
already providing most of this functionality as services that can
be consumed by the operating system, the boot loader or any other
user-provided application. And of course GRUB2 supported UEFI too.
Soon the Linux kernel gained the option of being compiled as an EFI
binary, via a stub that can be attached to the kernel code. This
implies that the kernel itself could be launched by the firmware
directly, making the boot loader something optional in most of the
cases.
Over time new and more straightforward boot loaders focused on UEFI
appeared, like gummiboot. Later this code was integrated into
systemd and renamed as systemd-boot.
The code is very simple. Many orders of magnitude simpler than
GRUB2. It is basically a very small EFI binary that presents a
menu with the different boot loader entries (text files described in
the Boot Loader Specification or BLS for short), and a call to
the UEFI LoadImage function to delegate the execution to the
selected kernel.
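To give an idea of how little a boot loader entry contains, here is a minimal Python sketch that parses a Type #1 entry. The key names (title, version, linux, initrd, options) come from the Boot Loader Specification. The sample values and the function name are invented for the example and do not claim to match what openSUSE writes to the ESP.

```python
SAMPLE_ENTRY = """\
# Hypothetical entry, for illustration only
title      openSUSE Tumbleweed
version    1@6.6.7-1-default
linux      /opensuse-tumbleweed/6.6.7-1-default/linux
initrd     /opensuse-tumbleweed/6.6.7-1-default/initrd
options    root=/dev/mapper/cr_root rootflags=subvol=@/.snapshots/1/snapshot
"""


def parse_bls_entry(text):
    """Parse a Type #1 entry: one 'key value' pair per line, '#' starts a comment."""
    entry = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        # A key such as 'initrd' may appear more than once, so keep a list per key.
        entry.setdefault(key, []).append(value)
    return entry


entry = parse_bls_entry(SAMPLE_ENTRY)
print(entry["linux"][0])
print(entry["options"][0])
```

The boot loader only has to show the title, load the file named by linux, and hand over the initrd and the options. Everything else is left to the firmware.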
This boot loader can also work with the new unified kernel images
(UKI), that are files that aggregate in a single unit the kernel,
the command line, and the initrd. Those UKIs can be very handy
for image based distributions, and openSUSE plans to support them as
well.
Providing systemd-boot as an alternative for GRUB2 is something
that openSUSE wanted to do for a long time. In August 2023 there was
an announcement on the Factory mailing list about Tumbleweed
supporting systemd-boot.
The announcement references a wiki entry that explains how to
migrate an installation using GRUB2 to systemd-boot manually.
Soon after the announcement, yast-bootloader gained support for
it for new installations.
Supporting another boot loader comes with a cost. As argued, the code
base is smaller, has fewer bugs and is easier to reason about. But
the UEFI dependency reduces the set of supported architectures to
x86-64 and aarch64. That problem can be largely alleviated by
providing another patch for GRUB2 to support the BLS entries, so
the architecture of the distribution after the boot loader can be
independent of the boot loader itself. The good news is that the
patch already exists, and could potentially be added into the package.
Another problem is that systemd-boot does not speak btrfs. As an
EFI binary, it can read files only from a FAT32 file system. This
limitation can be resolved by moving the kernel and the initrd into
the EFI system partition (ESP).
Finally, there is also the consideration of supporting snapshots in
Tumbleweed and transactions in MicroOS. From the boot loader the user
should be able to select which snapshot to boot from, as is
currently possible when using GRUB2. Both concepts are
implemented using btrfs subvolumes, and only a subset of the
kernel, command line and initrd combinations is valid for each of
those subvolumes.
For example, let’s say we have two snapshots in our system, and each of these represents a system that has two kernels installed. It is possible that those two kernels are not the same across all the snapshots. Maybe one of the upgrades replaced one kernel with a newer version. We need some tool that can do the bookkeeping required to associate the correct combination that will produce a successful boot into any of those snapshots, creating the boot entries under those restrictions.
This tool is sdbootutil. Every time snapper creates or
destroys a snapshot (for example, when the system gets updated), it
will call this tool that will analyze the content of the snapshots,
making sure that the corresponding kernel is installed in the ESP, a
valid initrd for this kernel is present (if not it will be created
calling mkinitrd) and a boot entry is created that connects the
kernel, the initrd and the snapshot via the command line. It also
takes care of other details, like checking the free space on the
partition.
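The bookkeeping can be pictured with a small Python model. This is not how sdbootutil is implemented: the data layout, the function name and the kernel versions are assumptions made for the example. It only shows the invariant that has to hold: one boot entry per valid (snapshot, kernel) pair, and an ESP that holds exactly the kernels that some snapshot still needs.

```python
def plan_boot_entries(snapshots):
    """Map {snapshot_id: set_of_kernel_versions} to boot entries and ESP content.

    Returns (entries, kernels_needed): entries is a sorted list of
    (snapshot_id, kernel) pairs, kernels_needed is the set of kernels
    that must be present in the ESP.
    """
    entries = sorted(
        (snapshot_id, kernel)
        for snapshot_id, kernels in snapshots.items()
        for kernel in kernels
    )
    kernels_needed = {kernel for _, kernel in entries}
    return entries, kernels_needed


# Hypothetical system: an update replaced kernel 6.6.3 with 6.6.7 in snapshot 2.
snapshots = {
    1: {"6.6.1", "6.6.3"},
    2: {"6.6.1", "6.6.7"},
}
entries, kernels_needed = plan_boot_entries(snapshots)
for snapshot_id, kernel in entries:
    print(f"snapshot {snapshot_id} -> kernel {kernel}")

# Deleting snapshot 1 leaves kernel 6.6.3 unreferenced, so it can leave the ESP.
del snapshots[1]
_, still_needed = plan_boot_entries(snapshots)
print(sorted(kernels_needed - still_needed))
```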
Usually this process works transparently, but it is good to remember that we can force a clean state with:
sdbootutil add-all-kernels
sdbootutil remove-all-kernels
Just in case, you know …
Full disk encryption
The other aspect that we want to announce is the support of full disk
encryption (FDE) based on systemd.
FDE is not the new kid on the block. GRUB2 has been able to unlock
LUKS volumes for a long time using the cryptomount command.
Traditionally this requests the password from the user twice: once
when the boot loader does the unlock and again when the initrd does
the same later. The second request can be avoided by injecting the
password into the initrd manually or, if you are using the openSUSE
package, the package injects it transparently.
Recently GRUB2 gained two new features: partial support of LUKS2
encrypted devices (using PBKDF2 as key derivation function instead
of the more secure and recommended Argon2id) and a key protection
mechanism that can store secrets in devices like the TPM2.
TPM2
Explaining how TPM2 works in detail is a topic for another post, but
for now we can think of it as a crypto device that can be used to
unlock secrets only when certain conditions related to the state of
the system are met. The TPM2 will unlock the secret if the system is
in a healthy state.
“Healthy” is a technical term here: it means that we can assert that
the system is in a known good state. In other words, we know for sure
that the firmware has not been tampered with, the boot loader is the
one that we installed and has not been replaced, that the kernel is
exactly the one that comes from the distribution, that the kernel
command line is the one that we expect, and that the initrd that we
used does not contain any extra binary that we do not control.
Internally the TPM2 has some registers, known as platform
configuration registers (PCRs). The TPM2 specification defines
24 of them, and each one is large enough to store the output of a
hash function, like SHA1 or SHA256. They are grouped in banks, one
per supported hash function, but this is too much detail for now.
Those registers are kind of special. We can reset them, usually
setting the value to 0. We can read the value, or we can “extend”
them. The write operation is designed so that we cannot set an
arbitrary value in the register: the only value we can store is the
result of applying the associated hash function to the concatenation
of the current PCR value and a new value provided by the user.
The current value of the PCR can only be produced by extending this
register using exactly the same sequence of values. If we change even
one bit of one of the values, we will produce a wildly different final
result for the same PCR.
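The extend operation is easy to model in Python. The sketch below assumes the SHA256 bank and a register that starts as 32 zero bytes after a reset. The byte strings standing in for the measured components are placeholders, not real firmware or kernel images.

```python
import hashlib


def extend(pcr_value, data):
    """Return the new PCR value: SHA256(old value || SHA256(data))."""
    measurement = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr_value + measurement).digest()


def replay(events):
    """Extend a freshly reset PCR (all zeros) with each event, in order."""
    pcr_value = bytes(32)
    for event in events:
        pcr_value = extend(pcr_value, event)
    return pcr_value


good = replay([b"firmware", b"boot loader", b"kernel"])
same = replay([b"firmware", b"boot loader", b"kernel"])
tampered = replay([b"firmware", b"boot loadeR", b"kernel"])
reordered = replay([b"boot loader", b"firmware", b"kernel"])

print(good.hex())
print(good == same, good == tampered, good == reordered)
```

Only the identical sequence reproduces the value. A single changed byte or a different order gives an unrelated result, and there is no operation that lets us write the desired value directly.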
This feature is used in a process known as “measured boot”, where
each stage in the boot chain is measured before it is executed. This
means that before the initial stages of the firmware are running,
there is a process that will calculate the hash of the code in memory,
and extend one of the PCRs using this value. This is repeated until
the very end of the boot sequence: the kernel and the initrd.
When measured boot is in place, the final values of the first 10
PCRs will contain values that can only be predicted if the machine
is using a well known version of firmware, boot loader and kernel,
together with the associated data like certificates, configuration
files, or kernel parameters. If one of those elements changes (for
example, by using a different secure boot certificate), it will
generate PCR values different from the ones that we expect.
TPM2 chips are very interesting devices, and the set of features goes
far beyond measured boot. If you want to learn more I recommend
resources like this or this.
TPM2 for FDE
Anyway, the gist here is that we can create a “policy” that
instructs the TPM2 to decrypt a secret only if certain PCRs
contain the expected values. The details are a bit different, but
for now let’s use this model as a good first approximation.
The idea is that we can encrypt a password with the values of certain
PCR registers, so GRUB2 can later attach the LUKS2 device if the
TPM2 can recover the password, validating the health of the system
up to this point. If the TPM2 fails to decrypt it, that means
that some PCR does not have the expected value and some stage in the
boot process changed. In this situation GRUB2 will ask the user for
the password to continue loading the kernel and the rest of the
system. It delegates the trust in the new state to the user.
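This first approximation can be written down as a toy Python model. It is only a mental model: a real TPM2 enforces the policy in hardware and never compares values in software like this. The class, the function names and the PCR selection are assumptions made for the example.

```python
import hashlib


def policy_digest(pcrs, selection):
    """Hash the selected PCR values into a single digest (toy PCR policy)."""
    h = hashlib.sha256()
    for index in sorted(selection):
        h.update(index.to_bytes(1, "big") + pcrs[index])
    return h.digest()


class ToyTPM:
    """Keeps a secret and releases it only when the PCR policy matches."""

    def __init__(self):
        self._sealed = None

    def seal(self, secret, expected_policy):
        self._sealed = (secret, expected_policy)

    def unseal(self, pcrs, selection):
        secret, expected_policy = self._sealed
        if policy_digest(pcrs, selection) != expected_policy:
            return None
        return secret


def unlock(tpm, pcrs, selection, ask_password):
    """Try the TPM first and fall back to asking the user."""
    secret = tpm.unseal(pcrs, selection)
    return secret if secret is not None else ask_password()


selection = [0, 2, 4, 7]  # arbitrary selection for the example
healthy = {i: hashlib.sha256(b"good stage %d" % i).digest() for i in selection}
tpm = ToyTPM()
tpm.seal(b"luks-volume-key", policy_digest(healthy, selection))

changed = dict(healthy)
changed[4] = hashlib.sha256(b"replaced boot loader").digest()

print(unlock(tpm, healthy, selection, lambda: b"typed by the user"))
print(unlock(tpm, changed, selection, lambda: b"typed by the user"))
```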
GRUB2 also provides a tool to seal secrets under the current values
of a subset of PCRs. This is nice but also presents several
problems. One is that maybe we are setting the system up in a way
that we know the PCR values will change during the next boot (for
example, during the first installation, a boot loader upgrade or a
firmware update). In this case sealing the password using the current
register values is useless: we need to be able to predict the new ones
and use those hypothetical values to do the sealing.
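Predicting a value means replaying the measurements with the component that is about to change swapped for its future version. The Python sketch below shows the idea on a heavily simplified event log. The tuple layout, the descriptions and the version strings are invented for the example and do not mirror the real TPM event log format or any existing tool.

```python
import hashlib


def measure(content):
    """Digest that would be extended into the PCR for this content."""
    return hashlib.sha256(content).digest()


def replay(event_log, pcr_index):
    """Recompute one PCR by extending a zeroed register with the logged digests."""
    value = bytes(32)
    for index, _description, digest in event_log:
        if index == pcr_index:
            value = hashlib.sha256(value + digest).digest()
    return value


def predict(event_log, description, new_content):
    """Return a copy of the log where one component is replaced by new content."""
    new_digest = measure(new_content)
    return [
        (index, desc, new_digest if desc == description else digest)
        for index, desc, digest in event_log
    ]


# Hypothetical event log: (PCR index, description, digest).
current_log = [
    (0, "firmware", measure(b"firmware v1")),
    (4, "boot loader", measure(b"systemd-boot 254")),
    (4, "kernel", measure(b"kernel 6.6.3")),
]

next_log = predict(current_log, "boot loader", b"systemd-boot 255")
print("PCR 4 now :", replay(current_log, 4).hex())
print("PCR 4 next:", replay(next_log, 4).hex())
```

Sealing against the second value instead of the first is what keeps the automatic unlock working after the boot loader upgrade.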
The other problem is more insidious and will become critical later.
The expected values can change frequently and may not be unique.
Maybe there is a set of valid ones. We can choose to boot from a
different kernel or from a different snapshot. The TPM2 provides a
solution for this using something known as authorized policies. They
are a way of creating policies that can change, but they are validated
by a signature. In essence, we create a public and a private key, and
we create multiple PCR policies that are signed using the private
key. Now the TPM2 can validate the signature using the public part,
and unseal the secret using the PCR values stored in the new
policy.
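A toy Python model can make the signed-policy idea concrete. To stay self-contained it uses textbook RSA built from two Mersenne primes, with no padding, which is NOT secure and only stands in for a real signing key. The function names and the way a “policy” is computed are likewise assumptions for the example, not the TPM2 algorithms.

```python
import hashlib

# Toy textbook RSA from two Mersenne primes. No padding, NOT secure:
# it only stands in for a real signing key in this sketch.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(policy):
    return int.from_bytes(hashlib.sha256(policy).digest(), "big") % N


def sign(policy):
    """Sign a policy with the private key (N, D)."""
    return pow(digest_as_int(policy), D, N)


def verify(policy, signature):
    """Check a signature using only the public key (N, E)."""
    return pow(signature, E, N) == digest_as_int(policy)


def unseal(secret, current_policy, signed_policies):
    """Release the secret if the current policy matches a validly signed one."""
    for policy, signature in signed_policies:
        if policy == current_policy and verify(policy, signature):
            return secret
    return None


def policy_for(boot_chain):
    """Stand-in for a PCR policy digest: a hash of the measured components."""
    return hashlib.sha256(b"|".join(boot_chain)).digest()


old_kernel = policy_for([b"firmware", b"systemd-boot", b"kernel 6.6.3"])
new_kernel = policy_for([b"firmware", b"systemd-boot", b"kernel 6.6.7"])
unknown = policy_for([b"firmware", b"evil loader", b"kernel 6.6.7"])

signed = [(p, sign(p)) for p in (old_kernel, new_kernel)]
forged = signed + [(unknown, 12345)]  # signature not made with the private key

print(unseal(b"luks-key", new_kernel, signed))
print(unseal(b"luks-key", unknown, forged))
```

The secret is bound to the public key, not to one set of PCR values, so new policies can be added later without sealing the secret again.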
Since early 2023 openSUSE provides the pcr-oracle tool to help
with the prediction of the PCR registers, and to encrypt a key under
those values using either PCR policies or authorized policies. Using
this tool we can now seal a secret under a set of PCR values that
can change!
In the openSUSE wiki we can find more documentation about those topics, including instructions about how to use it in our installation.
Using systemd for disk encryption
With GRUB2 the FDE is working properly, so why look for something
else? One reason is very evident: this architecture can only work
… well … if our openSUSE GRUB2 version is used. It will
not work for other boot loaders like systemd-boot. In fact it will
not work with the upstream version of GRUB2 itself.
But there is a second reason: we can argue that there is not a full
measured boot in place with GRUB2. If the boot loader needs to
unlock the device before it can load the kernel, it is natural that
the PCR policies that evaluate the health of the system cannot
make assertions about the kernel, command line or initrd that will be
used. Those will be loaded after the LUKS2 device has been opened.
The use of systemd-boot gives us an alternative architecture for
FDE that can work properly with any boot loader that follows the
BLS (remember, there is a patch for GRUB2 to support it somewhere,
so it is not excluded a priori), and provides the chance to do a
full measured boot attestation before unlocking the device.
One difference is that the kernel and the initrd will be placed in
the unencrypted ESP, and the unlock of the sysroot will be done
from inside the initrd using the different options that
systemd-cryptsetup offers. Currently it can unlock the device using
a normal password, a TPM2 with authorized policies (optionally with
a PIN that must be entered by the user) or a FIDO2 key device. In
the /etc/crypttab file we need to describe the unlocking
mechanism.
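As a rough sketch, the Python helper below builds the kind of line that describes each mechanism. The options tpm2-device=auto and fido2-device=auto are documented in crypttab(5). The volume name, the device path and the helper itself are hypothetical, and the file written by the real tools may carry more options.

```python
UNLOCK_OPTIONS = {
    "password": [],
    "tpm2": ["tpm2-device=auto"],
    "fido2": ["fido2-device=auto"],
}


def crypttab_line(name, device, method):
    """Build one /etc/crypttab line: name, device, key file and options."""
    fields = [name, device, "none"]  # "none": no key file, ask or use a token
    if UNLOCK_OPTIONS[method]:
        fields.append(",".join(UNLOCK_OPTIONS[method]))
    return " ".join(fields)


# Hypothetical volume name and device, for illustration only.
for method in ("password", "tpm2", "fido2"):
    print(crypttab_line("cr_root", "/dev/vda3", method))
```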
pcr-oracle has been extended to support the creation of authorized
policies that systemd can understand. They are stored in a JSON
file that contains multiple predictions, each one of them indicating
the PCRs involved, the TPM2 policy hash, the fingerprint of the
public key and the signature of the policy. This, together with the
public key PEM file, composes all the data required for
systemd-cryptsetup to use the TPM2 for the unseal of the LUKS2
key.
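A sketch of how such a file could be consumed, in Python. The field names (pcrs, pkfp, pol, sig) and the grouping by hash bank are assumptions chosen to match the four items listed above, and the values are shortened and made up. Check the systemd documentation for the real schema.

```python
import json

# Hypothetical document with shortened, made-up values.
SIGNATURE_JSON = """
{
  "sha256": [
    {"pcrs": [0, 2, 4, 7, 9], "pkfp": "ab12", "pol": "1111", "sig": "c2lnLTE="},
    {"pcrs": [0, 2, 4, 7, 9], "pkfp": "ab12", "pol": "2222", "sig": "c2lnLTI="}
  ]
}
"""


def candidate_predictions(document, bank, pcrs, key_fingerprint):
    """Return the predictions made for this bank, PCR set and signing key."""
    return [
        prediction
        for prediction in document.get(bank, [])
        if sorted(prediction["pcrs"]) == sorted(pcrs)
        and prediction["pkfp"] == key_fingerprint
    ]


document = json.loads(SIGNATURE_JSON)
matches = candidate_predictions(document, "sha256", [0, 2, 4, 7, 9], "ab12")
print([prediction["pol"] for prediction in matches])
```

Each candidate represents one acceptable boot state, for example one per kernel or snapshot. The unlock succeeds if the current PCR values satisfy any of them and its signature verifies against the enrolled public key.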
The RSA 2048 key used to sign the policy can be created with
openssl or with pcr-oracle itself. A note of caution: if the
private key gets leaked, it is game over for the security
that the TPM2 is expected to provide. Luckily the solution is cheap
in this case: generate a new key, re-register the key in the LUKS2
key slot with systemd-cryptenroll and use sdbootutil to regenerate
the predictions for each boot entry. Yeah … we will document the
whole process in the “systemd-fde” wiki page and provide better
tools, but trust me, it is indeed a cheap operation.
openSUSE is providing a MicroOS image named kvm-and-xen-sdboot that shows how all of this is working. This image contains some of the already mentioned tools integrated and some other new ones:
- systemd-boot: Boot loader used instead of the default GRUB2
- sdbootutil: Helper scripts to synchronize the boot entries of the system
- pcr-oracle: Predicts the PCR values for the next boot, and creates the authorized policies for systemd
- disk-encryption-tool: Encrypts the device where sysroot is located on the first boot
- dracut-pcr-signature: dracut module that will load the predictions into the initrd from the ESP
Those tools are designed to work together for this new FDE
architecture. What follows is a brief description of how everything
is connected.
Once we get the new MicroOS qcow2 image and we set up the VM, we can
proceed with the boot process. If the VM has a virtual TPM2 device
it will start measuring the executed code and data, extending the
corresponding PCRs. Once systemd-boot has been reached, it will
find the correct boot entry for this session and will read the
corresponding kernel and initrd from it.
At this moment the image is not encrypted. Inside the initrd that
is used during this first boot, the disk-encryption-tool script will
be called. Using some heuristics it will find the partition that
belongs to sysroot (where the system is located), and will resize it
to reserve 32MB for the LUKS2 header. After that it will use all
the magic that cryptsetup provides to re-encrypt the device using a
locally generated password. This password, as of today, corresponds
to the recovery key that will be presented to the user at the end;
the user should take note of it and keep it safe.
After the re-encryption, the system /etc/crypttab will be updated to
communicate that this device is now encrypted and should be managed
with different tools later.
At the end of the initrd we switch to the new sysroot, now finally
located in an encrypted device. The disk-encryption-tool script
already did its main job, but it installed two modules for
jeos-firstboot, that will be executed on the first boot of the
system, which is currently happening!
The first module, enroll, will detect if there is a FIDO2 key
inserted and a TPM2 available. If so it will present a dialog
asking what you want to use to unlock the system. The second module
will ask the user if the root password will also be enrolled in the
LUKS2 header as a new key, and will show the recovery key generated
earlier.
As of today it is not advisable to register both. As described
earlier, the FIDO2 key makes more sense if we are using a laptop
or a desktop machine and we want to unlock the encrypted device with
proof of a token that we own. This is an interactive process. The
TPM2 makes more sense in situations where we do not want to interact
with the system, and we want to automatically unlock the device only
if we can assert the health of the system (no tampering occurred in
the boot chain).
If we register the FIDO2 key, systemd-cryptenroll will be called
and we will be asked to press the button two times, after which the
installation process will be over. At the next boot we will be
required to present the key, and if the key is missing, we will be
asked for the recovery password.
If we register the TPM2 device, a new RSA 2048 key gets generated
and stored (the public and private parts) in /etc/systemd and
systemd-cryptenroll will be used to enroll the public key and to
annotate the PCRs that are used in the sealing of the LUKS2 key.
By default we will be using 0, 2, 4, 7, and 9. You can check the
meaning in this reference. PCRs 0 and 2 will measure all the
UEFI firmware code. PCR 4 will measure the boot loader
(systemd-boot) and the kernel (also UEFI binaries). PCR 7 will
register all the secure boot certificates, and PCR 9 will be used by
the kernel to measure the command line and the initrd.
This covers pretty much all that can make sense, but it is the user
who has the final word on what to measure. The reason is that the
predictions are done inside sdbootutil that, remember, will be
automatically executed after each change in the system (updates,
package removal, snapshots management, etc), and this tool will
produce predictions only for the PCRs registered in the LUKS2
header.
Regardless of the selected unlocking mechanism, the /etc/crypttab
file will be updated with this selection and a new initrd will be
generated to contain this information for the next boot.
Finally, the last component, dracut-pcr-signature, is
responsible for making sure that during the subsequent boots all the
information that systemd-cryptsetup requires for the unlock is
present “on-the-fly” inside the initrd. It should be noted that the
initrd requires the JSON file with the policies and the key, but
those cannot be included in the initrd! The moment that we make a
prediction of a PCR that is extended with the hash of the initrd,
the initrd is frozen: we cannot touch it anymore, as this would
produce a new hash and automatically invalidate the prediction.
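The circular dependency is easy to reproduce in a few lines of Python. This is a toy model with a single extend operation over a zeroed SHA256 register and placeholder content. The dictionary standing in for the ESP is an assumption for the example.

```python
import hashlib


def predicted_pcr(initrd):
    """Toy prediction: extend a zeroed PCR with the hash of the initrd."""
    return hashlib.sha256(bytes(32) + hashlib.sha256(initrd).digest()).digest()


initrd = b"contents of the initrd"
prediction = predicted_pcr(initrd)

# Option 1: embed the prediction in the initrd. The initrd changes, so the
# value measured at boot no longer matches the prediction it carries.
initrd_with_prediction = initrd + prediction
print(predicted_pcr(initrd_with_prediction) == prediction)

# Option 2: keep the prediction next to the initrd (for example in the ESP).
# The initrd is untouched and the measured value matches.
esp = {"initrd": initrd, "tpm2-pcr-signature.json": prediction}
print(predicted_pcr(esp["initrd"]) == esp["tpm2-pcr-signature.json"])
```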
This dracut module will be executed before the systemd-cryptsetup
generator for any encrypted devices has started, and will search in
the ESP partition for a tpm2-pcr-signature.json file that
contains all the valid predictions for the current boot. Once this
file is in place, systemd-cryptsetup will be able to assert that the
device in the current state is the expected one and the boot process
can continue until the end.
Future
The image is here, and is a sound PoC. It provides a much simpler
architecture and will place some components in the correct place.
This will help a lot in the next stages, as there are some other
things that we want to do with the distribution in relation to FDE.
One thing is pretty clear: disk-encryption-tool has limited use
outside image based installations. Part of this code should be living
in YaST and in Agama. The installer is already creating LUKS2 devices,
so it should be “easy” to extend it in a way that works for us.
Ideally, the jeos-firstboot modules should also live in the
installer, but somehow they make sense here too. In any case the
functionality should not be separated, and both should be merged.
The encryption tool is doing something right from the very start: the
master key, together with all the user keys are generated during
installation time, but one possible improvement is generating the
recovery key a bit later using the systemd tools. It is a small
detail, but separating system keys from user keys can simplify the
architecture.
Another aspect to improve is that the user may want to use the TPM2
and the FIDO2 key at the same time. For example, by default the
TPM2 is used, and if the state changed in a way that fails the
prediction (or there is a security breach that has been detected), the
user can delegate the unlock to the FIDO2 key, instead of using a
password.
The sdbootutil script contains a bunch of features that should be
also living in systemd. Working with upstream will make this tool
obsolete with time, which would be more good news.
Another improvement that we can help with in systemd is a better
diagnosis of the reasons that make the TPM2 reject the unseal of
the LUKS2 key. Today we get a general failure message that does not
report which PCR, or which measured component inside the PCR,
has a different hash than the one predicted. This would help a
lot in understanding what went wrong. Was the boot loader changed? Or
something in the firmware?
pcr-oracle is a very good tool for predicting the next PCR values.
It was very easy to extend to parse the new events in the log related
to the full measured boot process, including the kernel,
systemd-boot extensions on PCR 12, or generating the JSON
document required by systemd. The new systemd 255 (released a
week ago from the time of writing this) includes a similar tool named
systemd-pcrlock that can help us in providing the improved diagnosis
that we are looking for. Evaluating this tool to do the predictions
will be done soon too.
As of today Type#1 and Type#2 entries from the BLS are not isomorphic.
There are sections in the EFI file of the UKI format that do not
exist in the text representation. Maybe we will decide to use UKIs
in the future, or maybe not. So a good improvement is working on
helping with this unification, that will (among other things) provide
a standard way of splitting the JSON file and associating the
predictions to each boot loader entry.
Generating and registering a new key, or selecting a different set of
PCRs is today a manual process. The current tools can be extended
to help in those processes, or better documentation could be provided.
The new approach for FDE is not about excluding GRUB2 from the
equation. It is about providing a chance of using different boot
loaders that follow the BLS. Validating that a properly patched
(duh!) GRUB2 can work with all this is still something to be done.
Also, another thing that needs to be validated and improved is
the installation with multiple encrypted disks. In principle the
design and the code support it (even when the PCR registers per
volume are different). openQA will do wonders here.
And finally, we should rethink if the UKIs do make sense for
openSUSE or not. If we go in that direction, the private key used for
signing the policies will be kept in OBS and those policies will
also be generated in the build service, using a different set of PCR
values.
In any case, there is a bunch of work ahead of us.
Categories: blog Announcements openSUSE
Tags: openSUSE Community Security Full Disk Encryption FDE MicroOS Tumbleweed Rolling Release systemd