Sausage Factory: An introduction to building modules in Fedora

First off, let me be very clear up-front: normally, I write my blog articles to be approachable by readers of varying levels of technical background (or none at all). This will not be one of those. This will be a deep dive into the very bowels of the sausage factory.

This blog post assumes that the reader is aware of the Fedora Modularity Initiative and would like to learn how to build their very own modules for inclusion into the Fedora Project. I will guide you through the creation of a simple module built from existing Fedora Project packages on the “F26” branch.

To follow along, you will need a good working knowledge of the git source-control system (in particular, Fedora’s “dist-git”) as well as general comfort with Fedora system tools such as dnf and python.

Setting up the Module Build Service

For the purposes of this blog, I am going to use Fedora 25 (the most recent stable release of Fedora) as the host platform for this demonstration and Fedora 26 (the current in-development release) as the target. To follow along, please install Fedora 25 Server on a bare-metal or virtual machine with at least four processors and 8 GiB of RAM.

First, make sure that the system is completely up-to-date with all of the latest packages. Then we will install the “module-build-service” package. We will need version 1.3.22 or later of the module-build-service RPM and version 1.2.0 or later of python2-modulemd, which at the time of this writing are in the “updates-testing” repository.

dnf install --enablerepo=updates-testing module-build-service python2-modulemd

This may install a considerable number of dependency packages as well. Once this is installed, I recommend modifying /etc/module-build-service/config.py to change NUM_CONSECUTIVE_BUILDS to match the number of available processors on the system. (And yes, this option should be NUM_CONCURRENT_BUILDS, but we’re stuck with it for the moment. Edit 2017-06-05: this has been fixed upstream and will be in version 1.3.24 of the module-build-service).
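
For example, on the four-processor machine suggested above, the relevant line of /etc/module-build-service/config.py would end up looking something like this (an illustrative excerpt; everything else in the file stays as shipped):

NUM_CONSECUTIVE_BUILDS = 4  # one concurrent component build per available processor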

Leave the rest of the options alone at this time. The default configuration will interact with the production Fedora Project build-systems and is exactly what we want for the rest of this tutorial.

Gathering the module dependencies

So now that we have a build environment, we need something to build. For demonstration purposes, I’m going to build a module to provide the libtalloc library used by the Samba and SSSD projects. This is obviously a trivial example and would never become a full module on its own.

The first thing we need to do is figure out what runtime and build-time dependencies this package has. We can use dnf repoquery to accomplish this, starting with the runtime dependencies:

dnf repoquery --requires libtalloc.x86_64 --resolve

Which returns with:

glibc-0:2.25-4.fc26.i686
glibc-0:2.25-4.fc26.x86_64
libcrypt-0:2.25-4.fc26.x86_64
libcrypt-nss-0:2.25-4.fc26.x86_64

There are two libcrypt implementations that will satisfy this dependency, so we can pick one a little later. For glibc, we only want the one that will operate on the primary architecture, so we’ll ignore the .i686 version.

Next we need to get the build-time dependencies with:

dnf repoquery --requires --enablerepo=fedora-source --enablerepo=updates-source libtalloc.src --resolve

Which returns with:

docbook-style-xsl-0:1.79.2-4.fc26.noarch
doxygen-1:1.8.13-5.fc26.x86_64
libxslt-0:1.1.29-1.fc26.i686
libxslt-0:1.1.29-1.fc26.x86_64
python2-devel-0:2.7.13-8.fc26.i686
python2-devel-0:2.7.13-8.fc26.x86_64
python3-devel-0:3.6.1-6.fc26.i686
python3-devel-0:3.6.1-6.fc26.x86_64

OK, that’s not bad. Similar to the runtime dependencies above, we will ignore the .i686 versions. So now we have to find out which of these packages are already provided by the base-runtime, shared-userspace or common-build-dependencies modules, so we don’t need to rebuild them. Unfortunately, we don’t have a good reference location for getting this data yet (it’s coming a little ways into the future), so for the time being we will need to look directly at the module metadata YAML files for those modules.

When reading the YAML, the section that we are interested in is the api->rpms section. This part of the metadata describes the set of packages whose interfaces are public and can be consumed directly by the end-user or by other modules. Looking through these foundational modules, we see that base-runtime provides glibc, libcrypt and python3-devel; shared-userspace provides docbook-style-xsl, libxslt and python2-devel; and common-build-dependencies provides doxygen. So in this case, all of the dependencies are satisfied by these three core modules. If they were not, we’d need to recurse through the dependencies and figure out what additional packages we would need to include in our module to support libtalloc, or see if there was another module in the collection that provided them.
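
For reference, the portion of the metadata we are scanning for looks roughly like this (a trimmed, illustrative excerpt modeled on the base-runtime metadata; the real file lists many more packages):

data:
  api:
    rpms:
    - glibc
    - libcrypt
    - python3-devel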

So, the next thing we’re going to need to do is decide which version of libtalloc we want to package. What we want to do here is check out libtalloc from Fedora dist-git and then find a git commit hash that matches the build we want to add to our module. We can check out the libtalloc package repository by doing:

fedpkg clone rpms/libtalloc && cd libtalloc

Once we’re in this git checkout, we can use the git log command to find the commit hash that we want to include. For example:

[sgallagh@sgallagh540:libtalloc (master)]$ git log -1
commit f284a27d9aad2c16ba357aaebfd127e4f47e3eff (HEAD -> master, origin/master, origin/f26, origin/HEAD)
Author: Lukas Slebodnik <lslebodn@redhat.com>
Date:   Tue Feb 28 09:03:05 2017 +0100

    New upstream release - 2.1.9

    rhbz#1401225 - Rename python packages to match packaging guidelines
    https://fedoraproject.org/wiki/Changes/Automatic_Provides_for_Python_RPM_Packages

The string of hexadecimal characters following the word “commit” is the git commit hash. Save it somewhere, we’re going to need it in the next section.
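
If you would rather capture just the hash without the rest of the log output, git can print it directly:

git rev-parse HEAD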

Creating a new module

The first thing to be aware of is that the module build-service has certain constraints. The build can only be executed from a directory that has the same name as the module and will look for a file named modulename.yaml in that directory. So in our case, I’m going to name the module talloc, which means I must create a directory called talloc and a module metadata file called talloc.yaml. Additionally, the module-build-service will only work within a git checkout, so we will initialize this directory with a blank metadata file.

mkdir talloc && cd talloc
touch talloc.yaml
git init
git add talloc.yaml
git commit -m "Initial setup of the module"

Now we need to edit the module metadata file talloc.yaml and define the contents of the module. A module metadata file’s basic structure looks like this:

document: modulemd
version: 1
data:
  summary: Short description of this module
  description: Full description of this module
  license:
    module:
    - LICENSENAME
  references:
    community: Website for the community that supports this module
    documentation: Documentation website for this module
    tracker: Issue-tracker website for this module
  dependencies:
    buildrequires:
      base-runtime: f26
      shared-userspace: f26
      common-build-dependencies: f26
    requires:
      base-runtime: f26
      shared-userspace: f26
  api:
    rpms:
    - rpm1
    - ...
  filter:
    rpms:
    - filteredrpm1
    - ...
  components:
    rpms:
      rpm1:
        rationale: reason to include rpm1
        ref:

Let’s break this down a bit. First, the document type and version are fixed values. These determine the version of the metadata format. Next comes the “data” section, which contains all the information about this module.

The summary, description and references are described in the sample. The license field should describe the license of the module itself, not of its contents, which carry their own licenses.

The api section is a list of binary RPMs, built from the source RPMs in this module, whose presence you want to treat as “public”. In other words, these are the RPMs in this module that others can expect to be available for their use. Other RPMs may exist in the repository (to satisfy dependencies or simply because they were built as a side-effect of generating the RPMs that you need), but these are the ones that consumers should use.

On the flip side of that, we have the filter section. This is a place to list binary RPM packages that explicitly must not appear in the final module so that no user will try to consume them. The main reason to use this would be if a package builds a subpackage that is not useful to the intended audience and requires additional dependencies which are not packaged in the module. (For example, a module might contain a package that provides a plugin for another package and we don’t want to ship that other package just for this reason).

Each of the components describes a source RPM that will be built as part of this module. The rationale is a helpful comment to explain why it is needed in this module. The ref field describes any reference in the dist-git repository that can be used to acquire these sources. It is recommended to use an exact git commit here so that the results are always repeatable, but you can also use tag or branch names.

So our talloc module should look like this:

document: modulemd
version: 1
data:
  summary: The talloc library
  description: A library that implements a hierarchical allocator with destructors.
  stream: ''
  version: 0
  license:
    module:
    - LGPLv3+
  references:
    community: https://talloc.samba.org/
    documentation: https://talloc.samba.org/talloc/doc/html/libtalloc__tutorial.html
    tracker: http://bugzilla.samba.org/
  dependencies:
    buildrequires:
      base-runtime: f26
      shared-userspace: f26
      common-build-dependencies: f26
    requires:
      base-runtime: f26
  api:
    rpms:
    - libtalloc
    - libtalloc-devel
    - python-talloc
    - python-talloc-devel
    - python3-talloc
    - python3-talloc-devel
  components:
    rpms:
      libtalloc:
        rationale: Provides a hierarchical memory allocator with destructors
        ref: f284a27d9aad2c16ba357aaebfd127e4f47e3eff

You will notice I omitted the “filter” section because we want to provide all of the subpackages here to our consumers. Additionally, while most modules will require the shared-userspace module at runtime, this particular trivial example does not.

So, now we need to commit these changes to the local git repository so that the module build service will be able to see them.

git commit talloc.yaml -m "Added module metadata"

Now, we can build this module in the module build service. Just run:

mbs-build local

The build will proceed and will provide a considerable amount of output telling you what it is doing (and even more if you set LOG_LEVEL = 'debug' in the /etc/module-build-service/config.py file). The first time it runs, it will take a long time because it will need to download and cache all of the packages from the base-runtime and shared-userspace modules to perform the build. (Note: due to some storage-related issues in the Fedora infrastructure right now, you may see some of the file downloads time out, canceling the build. If you restart it, it will pick up from where it left off and retry those downloads.)

The build will run and deposit results in the ~/modulebuild/builds directory in a subdirectory named after the module and the timestamp of the git commit from which it was built. This will include mock build logs for each individual dependency, which will show you if it succeeded or failed.

When the build completes successfully, the module build service will have created a yum repository in the same results directory as the build logs containing all of the produced RPMs and repodata (after filtering out the undesired subpackages).
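
If you would like to try installing the results on the build host, you can point dnf directly at that repository. Something along these lines should work (the path is a placeholder; substitute the actual results directory that the build created under ~/modulebuild/builds):

sudo dnf --nogpgcheck \
    --repofrompath=talloc-local,$HOME/modulebuild/builds/<module-and-timestamp-dir> \
    install libtalloc libtalloc-devel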

And there you have it! Go off and build modules!

Base Runtime and the Generational Core

A Quick Primer on Modularity

Modularity (formerly, Modularization) is an ongoing initiative in Fedora to resolve the issue of divergent, occasionally conflicting lifecycles of different components. A module provides functionality (such as a web server) and includes well-integrated and well-tested components (such as Apache httpd and the libraries on which it depends). It can be deployed into production in various ways: as “classic” RPM packages or a container image, and is updated as a whole. Different modules can emphasize new features, stability, security, etc. differently.

Modules differ from traditional packaging in certain important ways. Perhaps most importantly, they allow us to separate internal implementation details from the exposed interfaces of the module. Historically in Fedora, if a packager wanted to deliver a new web application, that would also often mean that they needed to package and carry the framework or other libraries used by that application. This tended to be a double-edged sword: on the one hand, those libraries were now available for anyone to pick up and use in Fedora. However, in many cases, this meant that the primary maintainer of that package might actually have no specific knowledge or understanding of it except that its lack would mean their application didn’t work. This can be a problem if a person is carrying around a library for the use of a single helper function and doesn’t want to be responsible for issues in the rest of the library.

With a modular approach, the module itself will provide a definition of what public interfaces are stable for use by other projects. This way, they can opt to contain an internal-only implementation of some libraries.

A good metaphor for Modularity might be urban planning: sections of the Earth are selected for establishing housing, businesses and other construction projects. Each of these projects would be effectively a module.

Base Runtime: The Bedrock

The first thing that a construction company would look at when establishing a new building site would be the ground on which it is to be constructed. It is essential that this location be sturdy, reliable and capable of supporting the weight of the projects being built atop it.

In the Fedora Project, the bedrock upon which the other modules will be built is called the Base Runtime. The definition of this project has gone through a number of revisions over the last few months, but at this point it is fairly settled down to this:

The Base Runtime contains the software necessary to boot the system to a running kernel and the runtime libraries for the most rudimentary operation of the system.

In practical terms, this means that the Base Runtime is

  • Not installable by itself. It can only boot to a kernel; it has no init system or running applications.
  • Not self-hosting. It does not contain the packages necessary to rebuild itself from source. Other modules will need to provide this.
  • Limited to packages with an extremely stable public API. The Base Runtime needs to be swappable at any time out from under the running system without impacting the operation of applications running atop it.

System Runtime: Urban Infrastructure

Once you have a location chosen and have built some homes and some businesses, you need a certain amount of infrastructure to connect it all together: roads, plumbing, electricity, etc.

In the case of a computer operating system, this means things like service control and monitoring, filesystem operations, command shells and other basic tools for operating on and maintaining the system. In Fedora, this means essentially the bash shell environment, login services and the standard POSIX command utilities.

The primary reason for separating the System Runtime from the Base Runtime is to allow these two modules to carry different API lifecycle guarantees. While the Base Runtime will need to remain backwards compatible for an extended period, it may be permissible for the System Runtime to see revisions at a higher rate (where it makes sense to provide new functionality faster). For example, Fedora may wish to update the systemd project (used for control, monitoring and management of system services) at a much higher rate than the low-level C runtime library.

Shared Components: Building Materials

In order to build up your city, you naturally need a set of raw materials. Wood, stone, metal and essential workers.

The third and final piece of the puzzle is the Shared Components module. This special module comprises the set of low-level libraries common to both the Base Runtime and the System Runtime. Some of these libraries may be made available for other services to consume, but the majority of them will be held privately within the modules.

Generational Core: Local Government

After building up the town, it is important to have a mechanism in place for maintaining it, improving it and making sure to adapt to changing conditions in the world around it. In the urban planning world, this would mean that the local government would establish policies and groups capable of performing these tasks. They would be given access to all of the tools used to create the city in the first place and would continue to monitor things and address issues as they arise. This is effectively what the Generational Core is: the tying together of all those disparate components as a single entity.

While the Base Runtime, System Runtime and Shared Components modules will be built separately and maintained for independent lifecycles, they will be delivered to end-users as part of a single combined module stack called the Generational Core. (Defining “Generational” in the sense of “genealogy” as opposed to “creation”).

Unlike the Base Runtime and System Runtime, the Generational Core will be installable and very similar to the “minimal install” of previous releases of Fedora. It will be somewhat more stripped down even than those. For example, the Generational Core does not need to provide network management services, remote login capabilities or network storage connectivity. These features will be provided by additional modules and module stacks built atop the Generational Core.

Sausage Factory: Multiple Edition Handling in Fedora

First off, let me be very clear up-front: normally, I write my blog articles to be approachable by readers of varying levels of technical background (or none at all). This will not be one of those. This will be a deep dive into the very bowels of the sausage factory.

The Problem

Starting with the Fedora.next initiative, the Fedora Project embarked on a journey to reinvent itself. A major piece of that effort was the creation of different “editions” of Fedora that could be targeted at specific user personas. Instead of having a One-Size-Fits-Some Fedora distribution, we would produce an operating system for “doers” (Fedora Workstation Edition), for traditional infrastructure administrators (Fedora Server Edition) and for new, cloudy/DevOps folks (Fedora Cloud Edition).

We made the decision early on that we did not want to produce independent distributions of Fedora. We wanted each of these editions to draw from the same collective set of packages as the classic Fedora. There were multiple reasons for this, but the most important of them was this: Fedora is largely a volunteer effort. If we started requiring that package maintainers had to do three or four times more work to support the editions (as well as the traditional DIY deployments), we would quickly find ourselves without any maintainers left.

However, differentiating the editions solely by the set of packages that they deliver in a default install isn’t very interesting. That’s actually a problem that could have been solved simply by having a few extra choices in the Anaconda installer. We also wanted to solve some classic arguments between Fedora constituencies about what the installed configuration of the system looks like. For example, people using Fedora as a workstation or desktop environment in general do not want OpenSSH running on the system by default (since their access to the system is usually by sitting down physically in front of a keyboard and monitor) and therefore don’t want any potential external access available. On the other hand, most Fedora Server installations are “headless” (no input devices or monitor attached) and thus having SSH access is critical to functionality. Other examples include the default firewall configuration of the system: Fedora Server needs to have a very tightened default firewall allowing basically nothing in but SSH and management features, whereas a firewall that restrictive proves to be harmful to usability of a Workstation.

Creating Per-Edition Default Configuration

The first step to managing separate editions is having a stable mechanism for identifying what edition is installed. This is partly aesthetic, so that the user knows what they’re running, but it’s also an important prerequisite (as we’ll see further on) to allowing the packaging system and systemd to make certain decisions about how to operate.

The advent of systemd brought with it a new file that describes the installed system, called os-release. This file is considered to be authoritative for information identifying the system, so it seemed like the obvious place to extend with information about the edition that is running as well. We therefore needed a way to ensure that the different editions of Fedora would produce a unique (and correct) version of the os-release file depending on the edition being installed. We did this by expanding the os-release file to include two new values: VARIANT and VARIANT_ID. VARIANT_ID is a machine-readable unique identifier that describes which variant (edition) of Fedora is installed. VARIANT is a human-readable description.
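
As a concrete example, the os-release file on a Fedora Server system contains lines like these (an excerpt; the other standard fields are omitted):

NAME=Fedora
VERSION="25 (Server Edition)"
VARIANT="Server Edition"
VARIANT_ID=server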

In Fedora, the os-release file is maintained by a special RPM package called fedora-release. The purpose of this package is to install the files onto the system that guarantee this system is Fedora. Among other things, this includes os-release, /etc/fedora-release, /etc/issue, and the systemd preset files. (All of those will become interesting shortly).

So the first thing we needed to do was modify the fedora-release package such that it included a series of subpackages for each of the individual Fedora editions. These subpackages would be required to carry their own version of os-release that would supplant the non-edition version provided by the fedora-release base package. I’ll circle back around to precisely how this is done later, but for now accept that this is true.

So now that the os-release file on the system is guaranteed to contain the appropriate VARIANT_ID, we needed to design a mechanism by which individual packages could make different decisions about their default configurations based on it. The full technical details of how to do this are captured in the Fedora Packaging Guidelines, but the basic gist is that any package that wants to behave differently between two or more editions must read the VARIANT_ID from os-release during its %posttrans (post-transaction) phase of package installation and put a symlink to the correct default configuration file in place. This needs to be done in the %posttrans phase because, due to the way that yum/dnf processes the assorted RPMs, there is no other way to guarantee that the os-release file has the right values until that time: a package could otherwise install and run its %post script after the fedora-release package is installed but before the fedora-release-EDITION package is.
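
As a rough sketch of what such a scriptlet can look like (the package name, configuration paths and file names below are invented purely for illustration; the Packaging Guidelines remain the canonical recipe):

%posttrans
# Hypothetical "foo" package: pick a default config based on the edition.
. /usr/lib/os-release
case "$VARIANT_ID" in
    server)      ln -sf foo.conf.server      /etc/foo/foo.conf ;;
    workstation) ln -sf foo.conf.workstation /etc/foo/foo.conf ;;
    *)           ln -sf foo.conf.default     /etc/foo/foo.conf ;;
esac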

That all assumes that the os-release file is correct, so let’s explore how that is made to happen. First of all, we created a new directory in /usr/lib called /usr/lib/os.release.d/ which will contain all of the possible alternate versions of os-release (and some other files as well, as we’ll see later). As part of the %install phase of the fedora-release package, we generate all of the os-release variants and then drop them into os.release.d. We will then later symlink the appropriate one into /usr/lib/os-release and /etc/os-release during %post.

There’s an important caveat here: the /usr/lib/os-release file must be present and valid in order for any package to run the %systemd_post scripts to set up their unit files properly. As a result, we need to take a special precaution. The fedora-release package will always install its generic (non-edition) os-release file during its %post section, to ensure that the %systemd_post scripts will not fail. Then later if a fedora-release-EDITION package is installed, it will overwrite the fedora-release one with the EDITION-specific version.

The more keen-eyed reader may have already spotted a problem with this approach as currently described: what happens if a user installs another fedora-release-EDITION package later? The short answer is that in early attempts at this, “Bad Things Happened”. We originally had considered that installing a fedora-release-EDITION package atop a system that previously had only fedora-release on it would convert the system to that edition. However, that turned out to A) be problematic and B) violate the principle of least surprise for many users.

So we decided to lock the system to the edition that was first installed by adding another file, /usr/lib/variant, which is essentially just a copy of the VARIANT_ID line from /etc/os-release. The %post script of each of the fedora-release subpackages (including the base subpackage) checks this file for its contents. If it does not exist, the %post script of a fedora-release-EDITION package will create it with the appropriate value for that edition. If processing reaches all the way to the %posttrans script of the fedora-release base package (meaning no edition package was part of the transaction), then it will write the variant file at that point to lock it into the non-edition variant.
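
Expressed as shell pseudocode (the real scriptlets are written in Lua, as described at the end of this post, and the values here are simplified), the locking logic in an edition %post amounts to roughly:

# Sketch of fedora-release-server %post; the fedora-release base %posttrans
# does the same thing with the non-edition value if this file still does
# not exist by the end of the transaction.
if [ ! -e /usr/lib/variant ]; then
    echo "VARIANT_ID=server" > /usr/lib/variant
fi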

There remains a known bug with this behavior: if the *initial* transaction actually includes two or more fedora-release-EDITION subpackages, whichever one is processed first will “win” and write the variant file. In practice, this is unlikely to happen since all of the install media are curated to include at most one fedora-release-EDITION package.

I said above that this “locks” the system into the particular edition, but that’s not strictly true. We also ship a script along with fedora-release that allows an administrator to manually convert between editions by running `/usr/sbin/convert-to-edition -e <edition>`. Effectively, this just reruns the steps that the %post of that edition would run, except that it skips the check for whether the variant file is already present.
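
For example, assuming “server” is the accepted identifier for Fedora Server Edition, the conversion would be invoked as:

sudo /usr/sbin/convert-to-edition -e server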

Up to now, I’ve talked only about the os-release file, but the edition-handling also addresses several other important files on the system, including /etc/issue and the systemd presets. /etc/issue is handled identically to the os-release file, with the symlink being created by the %post scripts of the fedora-release-EDITION subpackages or the %posttrans of the fedora-release package if it gets that far.

The systemd presets are a bit of a special case, though. First of all, they do not replace the global default presets, but they do supplement them. This means that what we do is symlink an edition-specific preset into the /usr/lib/systemd/system-preset/ directory. These presets can either enable new services (as in the Server Edition, where they turn on Cockpit and rolekit) or disable them (as in Workstation Edition, where they shut off OpenSSH). However, because the presets are normally only applied when a service’s package runs its %post scriptlet, we need to force systemd to re-apply them after we add the new values.
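
For illustration, the preset dropped in by the Server edition contains directives along these lines (the file name and exact unit list here are representative, not authoritative):

# /usr/lib/systemd/system-preset/80-server.preset (representative sketch)
enable cockpit.socket
enable rolekit.service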

We need to be careful when doing this, because we only want to apply the new presets if the current transaction is the initial installation of the fedora-release-EDITION package. Otherwise, an upgrade could override choices that the user themselves have made (such as disabling a service that defaults to enabled). This could lead to unexpected security issues, so it has to be handled carefully.

In this implementation, instead of just calling the command to reprocess all presets, we instead parse the preset files and just process only those units that are mentioned in them. (This is to be overcautious in case any other package is changing the default enabled state besides systemd, such as some third-party RPMs that might have `systemctl enable httpd.service` in their %post section, for example.)
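
In shell terms, that re-application amounts to something like the following sketch (the actual implementation lives in the fedora-release scriptlets and is more defensive than this; the preset file name carries over from the earlier illustration):

# Re-apply presets only for the units named in the edition preset file.
while read -r action unit; do
    case "$action" in
        enable|disable) systemctl preset "$unit" ;;
    esac
done < /usr/lib/systemd/system-preset/80-server.preset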

Lastly, because we are using symlinks to manage most of this, we had to write the %post and %posttrans scripts in the built-in Lua implementation carried by RPM. This allowed us to call posix.symlink() without having to add a dependency on coreutils to do it in bash (an approach that had previously resulted in a circular dependency and broken installations). We wrote this as a single script that is imported by the RPM spec file during the SRPM build phase. This script is actually copied by rpmbuild into the scriptlet sections verbatim, so the script must be present in the dist-git checkout on its own and not merely as part of the exploded tarball. So when modifying the Lua script, it’s important to make sure to modify the copy in dist-git as well as the copy upstream.
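
To give a flavor of the Lua approach, a heavily simplified scriptlet might look like this (a sketch only; the real script in the fedora-release dist-git does considerably more, and the file names are assumed for illustration):

%post server -p <lua>
-- RPM's embedded Lua interpreter provides the posix module, so we can
-- create the symlink without pulling in coreutils to run ln.
posix.symlink("./os.release.d/os-release-server", "/usr/lib/os-release")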