Rolekit (or “How I learned to stop thinking in terms of packages”)

What’s the problem?

Let’s start with a simplification and discuss the lifecycle of software at a high-level:

  1. Research and Development – In this phase, the software is designed, coded and (hopefully) tested.
  2. Packaging – Here, we take the compiled, tested bits of the software and bundle it up into some sort of package that can be used to deliver it to a user.
  3. Deployment – An end-user takes the package and does something interesting with it (for the purists out there, I’m lumping the test, staging and production environments into the “deployment” category).

Despite the brevity of the list above, there are a lot of moving parts here. I’m going to use the Fedora process to illustrate how this all works in a pre-rolekit world and then talk a little bit about the limitations, some of the alternatives and finally how rolekit addresses the issue. First, though, I’ll answer the question I posited in the header: “What’s the problem?”

The problem to be solved is how to get useful software up and running in an end-user’s environment with the least amount of difficulty for the user. The first and most important rule in software is this: software is a means to an end, not an end unto itself. People install a piece of software in order to achieve a goal. This goal could be something relatively simple, such as “I want to listen to this MP3 I bought” or as complex as “I run the IT department for a multinational manufacturing company and I want to keep track of all my products, the rate of their sales and margins as well as what my competitors are doing”. The job of software is to enable the user to get to that desired state. To that end, I would argue this: it is far more important to help the user get started than it is to offer them every possible feature.

Some of you may interject: “But if you don’t have the feature they need, won’t they go to someone who does?”. Sure, sometimes that will happen. But you will probably discover that people will make a different tradeoff than you might think: “I can get 90% of what I need and get it set up in a few weeks” is a far more compelling statement to make to a financial decision-maker than “This product provides everything we need, but I’ll need two more full-time people to get it running next year”.

What are we doing today?

Open source development is fairly unique compared to traditional software development. One of its major advantages for development can also become its biggest challenge to deployment. Because of the breadth of open source projects out there, there is almost always someone who has done at least a piece of what you want to do already. These other projects, such as coding libraries, web application frameworks, video game engines, etc. all provide the building blocks to start your work. The great thing here is that you can pick up the pieces that you need from somewhere else and then focus your attention only on the parts that make your project unique or exciting.

However, the challenge starts happening when you get to the packaging phase. Now that you have something you want to share with the world, you need to package it in a manner that allows them to use it. There are generally two schools of thought on how to do this, each with their own strengths and weaknesses.

  1. Grab the source code (or pre-built binaries) for everything that you depend on for your project to work and package them all together in a single deliverable.
  2. Package all of your dependencies separately in their own deliverables

I’m not going to go into the details of why, but the Fedora Project has policies that require the second option. (If you’re interested in the reasoning, I strongly recommend reading the Fedora Packaging Guidelines page on the subject). Fedora then provides a dependency-resolution mechanism that simplifies this case by ensuring that when you attempt to retrieve the package you want, it also automatically installs all of the packages that it depends on (and so on, recursively until they are all satisfied).

How do we deploy it now?

There are two schools of thought on this subject, which I will refer to as the “Fedora Approach” and the “Debian Approach”, since those two Linux distributions best represent them. (Note: my understanding of the Debian Approach is second-hand, so if I get any of the subtleties incorrect, please feel free to leave a comment and I’ll correct it).

The Debian Approach

In Debian and its derivatives (such as Ubuntu, Mint, etc.), when the package resolution is completed and the packages are downloaded, the user is required to indicate at that time their explicit decision on how the package must behave. Through a system called “debconf”, package installation is directly tied to deployment; the package installation cannot conclude without it being explicitly configured at that time. If the installation is non-interactive (such as if the installation was initiated by another service, rather than the user), the configuration must either be specified by an “answer file” (a configuration file passed to debconf stating the answers in advance) or else the package must provide a sensible set of defaults to automatically deploy it.

 The Fedora Approach

In Fedora and its derivatives (such as Red Hat Enterprise Linux, CentOS, Scientific Linux, etc.), when the package resolution is completed and the packages are downloaded, that’s it. In the vast majority of cases, the software is now on the system, but it is not configured to do anything at all. (There are a few specific exceptions which have been granted by the Fedora Engineering Steering Committee for things like the firewall). On these systems, nothing will happen until the user takes an explicit action to configure and start the services.

“That sounds like the Debian Approach is better!” you may say. However, there are concerns to be had here. For one, the above explanation I made about dependency-resolution comes into play; you as a user may not be fully aware of what packages are going to be pulled in by your dependencies (even accidentally). Furthermore, just because you installed a web-server package, it doesn’t mean that you necessarily want it running immediately. So, Fedora forces you to make these decisions explicitly, rather than implicitly. So when you’re ready, you configure the software and then start it up.

Where does this fall down?

The real problem is that the concept of “packages” derives very much from the engineering side of things. A package is a logical bundling of software for the developers. Not all problems can be solved with a single package, though. For example, the FreeIPA identity-management solution requires many top-level packages including an LDAP directory server, a certificate authority server, a DNS server and others. In this, the concept of a “package” gets more than a little fuzzy. In this particular case (as has been common historically), the solution was “Let’s make another package that glues them together!”. So the FreeIPA package just adds those other packages to its dependency chain.

But just adding more packages doesn’t necessarily solve the end-user concern: How do I easily deploy this?

Enter rolekit

Rolekit was designed to be specifically for handling the deployment situation and shield end-users from the concept of project-level packages. Instead, complete solutions will be “packaged” as Server Roles. Users will come to rolekit and declare a machine to be e.g. a Domain Controller, providing the minimum information necessary to set it up (today, that’s just an admin password in the Domain Controller example). Rolekit will handle all of the other necessary work under the hood, which involves downloading the appropriate packages, installing them on the system, setting up the configuration, starting the appropriate services and carefully opening up the firewall to allow access to it.

There are a lot of moving parts involved in deploying a role, but the user doesn’t really need to know what they are. If they can be shielded from much of the noise and churn inherent in package installation, configuration, service management and firewall settings, then they get back much of their time for solving the problems unique to their environments.

Fedora and Server Roles

As of Fedora 21, we have implemented the first release of the rolekit framework as well as a single representative Role: the Domain Controller. For Fedora 22, we’re working with the Cockpit project to produce a simple and powerful graphical interface to deploy the Domain Controller Role as well as building a new Database Server Role. As the project progresses, we very much hope that others will come forward to help us build more solutions. A few that I’d love to see (but don’t have time to start on yet):

  • A fileserver role that manages Samba and NFS file-shares (maybe [s]ftp as well).
  • A mail and/or groupware server role built atop something like Kolab
  • A backup server

Welcome to the post-package world, my friends!