Self-Signed SSL/TLS Certificates: Why They are Terrible and a Better Alternative

A Primer on SSL/TLS Certificates

Many of my readers (being technical folks) are probably already aware of the purpose and value of certificates, but in case you are not familiar with them, here’s a quick overview of what they are and how they work.

First, we’ll discuss public-key encryption and public-key infrastructure (PKI). It was realized very early on in human history that sometimes you want to communicate with other people in a way that prevents unauthorized people from listening in. All throughout time, people have been devising mechanisms for obfuscating communication in ways that only the intended recipient of the code would be able to understand. This obfuscation is called encryption, the data being encrypted is called plaintext and the encrypted data is called ciphertext. The cipher is the mathematical transformation that is used to turn the plaintext into the ciphertext and relies upon one or more keys known only to trusted individuals to get the plaintext back.

Early forms of encryption were mainly “symmetric” encryption, meaning that the cipher used the same key for both encryption and decryption. If you’ve ever added a password to a PDF document or a ZIP file, you have been using symmetric encryption. The password is a human-understandable version of a key. For a visual metaphor, think about the key to your front door. You may have one or more such keys, but they’re all exactly alike and each one of them can both lock and unlock the door and let someone in.

Nowadays we also have forms of encryption that are “asymmetric”. What this means is that one key is used to encrypt the message and a completely different key is used to decrypt it. This is a bit harder for many people to grasp, but it works on the basic mathematical principle that some actions are much more complicated to reverse than others. (A good example I’ve heard cited is that it’s pretty easy to figure out the square of any number with a pencil and a couple minutes, but most people can’t figure out a square-root without a modern calculator). This is harder to visualize, but the general idea is that once you lock the door with one key, only the other one can unlock it. Not even the one that locked it in the first place.

So where does the “public” part of public-key infrastructure come in? What normally happens is that once an asymmetric key-pair is generated, the user will keep one of those two keys very secure and private, so that only they have access to it. The other one will be handed out freely through some mechanism to anyone at all that wants to talk to you. Then, if they want to send you a message, they simply encrypt their message using your public key and they know you are the only one who can decrypt it. On the flip side, if the user wanted to send a public message but provide assurance that it came from them, they can also sign a message with the private key, so that the message will contain a special signature that can be decrypted with their public key. Since only one person should have that key, recipients can trust it came from them.

Astute readers will see the catch here: how do users know for certain that your public key is in fact yours? The answer is that they need to have a way of verifying it. We call this establishing trust and it’s exceedingly important (and, not surprisingly, the basis for the rest of this blog entry). There are many ways to establish trust, with the most foolproof being to receive the public key directly from the other party while looking at two forms of picture identification. Obviously, that’s not convenient for the global economy, so there needs to be other mechanisms.

Let’s say the user wants to run a webserver at “www.mydomain.com”. This server might handle private user data (such as their home address), so a wise administrator will set the server up to use HTTPS (secure HTTP). This means that they need a public and private key (which in this case we call a certificate). The common way to do this is for the user to contact a well-known certificate authority and purchase a signature from them. The certificate authority will do the hard work of verifying the user’s identity and then sign their webserver certificate with the CA’s own private key, thus providing trust by way of a third-party. Many well-known certificate authorities have their public keys shipped by default in a variety of operating systems, since the manufacturers of those systems have independently verified the CAs in turn. Now everyone who comes to the site will see the nice green padlock on their URL bar that means their communications are encrypted.

A Primer on Self-Signed Certificates

One of the major drawbacks to purchasing a CA signature is that it isn’t cheap: the CAs (with the exception of Let’s Encrypt) are out there to make money. When you’re developing a new application, you’re going to want to test that everything works with encryption, but you probably aren’t going to want to shell out cash for every test server and virtual machine that you create.

The solution to this has traditionally been to create what is called a self-signed certificate. What this means is that instead of having your certificate signed by a certificate authority, you instead use the certificates public key to add a signature to the private key. The problem with this approach is that web browsers and other clients that verify the security of the connection will be unable to verify that the server is who it says it is. In most cases, the user will be presented with a warning page that informs them that the server is pretending to be the one you went to. When setting up a test server, this is expected. Unfortunately, however, clicking through and saying “I’m sure I want to connect” has a tendency to form bad habits in users and often results in them eventually clicking through when they shouldn’t.

It should be pretty obvious, but I’ll say it anyway: Never use a self-signed certificate for a production website.

One of the problems we need to solve is how to avoid training users to ignore those warnings. One way that people often do this is to load their self-signed certificate into their local trust store (the list of certificate authorities that are trusted, usually provided by the operating system vendor but available to be extended by the user). This can have some unexpected consequences, however. For example, if the test machine is shared by multiple users (or is breached in a malicious attack), then the private key for the certificate might fall into other hands that would then use it to sign additional (potentially malicious) sites. And your computer wouldn’t try to warn you because the site would be signed by a trusted authority!

So now it seems like we’re in a Catch-22 situation: If we load the certificate into the trusted authorities list, we run the risk of a compromised private key for that certificate tricking us into a man-in-the-middle attack somewhere and stealing valuable data. If we don’t load it into the trust store, then we are constantly bombarded by a warning page that we have to ignore (or in the case of non-browser clients, we may have to pass an option not to verify the client) in which case we could still end up in a man-in-the-middle attack, because we’re blindly trusting the connection. Neither of those seems like a great option. What’s a sensible person to do?

Two Better Solutions

So, let’s take both of the situations we just learned about and see if we can locate a middle ground somewhere. Let’s go over what we know:

  • We need to have encryption to protect our data from prying eyes.
  • Our clients need to be able to trust that they are talking to the right system at the other end of the conversation.
  • If the certificate isn’t signed by a certificate in our trust store, the browser or other clients will warn or block us, training the user to skip validation.
  • If the certificate is signed by a certificate in our trust store, then clients will silently accept it.
  • Getting a certificate signed by a well-known CA can be too expensive for an R&D project, but we don’t want to put developers’ machines at risk.

So there are two better ways to deal with this. One is to have an organization-wide certificate authority rather than a public one. This should be managed by the Information Technologies staff. Then, R&D can submit their certificates to the IT department for signing and all company systems will implicitly trust that signature. This approach is powerful, but can also be difficult to set up (particularly in companies with a bring-your-own-device policy in place). So let’s look at a another solution that’s closer to the self-signed approach.

The other way to deal with it would be to create a simple site-specific certificate authority for use just in signing the development/test certificate. In other words, instead of generating a self-signed certificate, you would generate two certificates: one for the service and one to sign that certificate. Then (and this is the key point – pardon the pun), you must delete and destroy the private key for the certificate that did the signing. As a result, only the public key of that private CA will remain in existence, and it will only have ever signed a single service. Then you can provide the public key of this certificate authority to anyone who should have access to the service and they can add this one-time-use CA to their trust store.

Now, I will stress that the same rule holds true here as for self-signed certificates: do not use this setup for a production system. Use a trusted signing authority for such sites. It’s far easier on your users.

A Tool and a Tale

I came up with this approach while I was working on solving some problems for the Fedora Project. Specifically, we wanted to come up with a way to ensure that we could easily and automatically generate a certificate for services that should be running on initial start-up (such as Cockpit or OpenPegasus). Historically, Fedora had been using self-signed certificates, but the downsides I listed above gnawed at me, so I put some time into it and came up with the private-CA approach.

In addition to the algorithm described above, I’ve also built a proof-of-concept tool called sscg (the Simple Signed Certificate Generator) to easily enable the creation of these certificates (and to do so in a way that never drops the CA’s private key onto a filesystem; it remains in memory). I originally wrote it in Python 3 and that version is packaged for use in Fedora today. This past week as a self-assigned exercise to improve my knowledge of Go, I rewrote the sscg in that language. It was a fun project and had the added benefit of removing the fairly heavyweight dependency on the Python 3 version. I plan to package the golang version for Fedora 25 at some point in the near future, but if you’d like to try it out, you can clone my github repository. Patches and suggestions for functionality are most welcome.

Rolekit (or “How I learned to stop thinking in terms of packages”)

What’s the problem?

Let’s start with a simplification and discuss the lifecycle of software at a high-level:

  1. Research and Development – In this phase, the software is designed, coded and (hopefully) tested.
  2. Packaging – Here, we take the compiled, tested bits of the software and bundle it up into some sort of package that can be used to deliver it to a user.
  3. Deployment – An end-user takes the package and does something interesting with it (for the purists out there, I’m lumping the test, staging and production environments into the “deployment” category).

Despite the brevity of the list above, there are a lot of moving parts here. I’m going to use the Fedora process to illustrate how this all works in a pre-rolekit world and then talk a little bit about the limitations, some of the alternatives and finally how rolekit addresses the issue. First, though, I’ll answer the question I posited in the header: “What’s the problem?”

The problem to be solved is how to get useful software up and running in an end-user’s environment with the least amount of difficulty for the user. The first and most important rule in software is this: software is a means to an end, not an end unto itself. People install a piece of software in order to achieve a goal. This goal could be something relatively simple, such as “I want to listen to this MP3 I bought” or as complex as “I run the IT department for a multinational manufacturing company and I want to keep track of all my products, the rate of their sales and margins as well as what my competitors are doing”. The job of software is to enable the user to get to that desired state. To that end, I would argue this: it is far more important to help the user get started than it is to offer them every possible feature.

Some of you may interject: “But if you don’t have the feature they need, won’t they go to someone who does?”. Sure, sometimes that will happen. But you will probably discover that people will make a different tradeoff than you might think: “I can get 90% of what I need and get it set up in a few weeks” is a far more compelling statement to make to a financial decision-maker than “This product provides everything we need, but I’ll need two more full-time people to get it running next year”.

What are we doing today?

Open source development is fairly unique compared to traditional software development. One of its major advantages for development can also become its biggest challenge to deployment. Because of the breadth of open source projects out there, there is almost always someone who has done at least a piece of what you want to do already. These other projects, such as coding libraries, web application frameworks, video game engines, etc. all provide the building blocks to start your work. The great thing here is that you can pick up the pieces that you need from somewhere else and then focus your attention only on the parts that make your project unique or exciting.

However, the challenge starts happening when you get to the packaging phase. Now that you have something you want to share with the world, you need to package it in a manner that allows them to use it. There are generally two schools of thought on how to do this, each with their own strengths and weaknesses.

  1. Grab the source code (or pre-built binaries) for everything that you depend on for your project to work and package them all together in a single deliverable.
  2. Package all of your dependencies separately in their own deliverables

I’m not going to go into the details of why, but the Fedora Project has policies that require the second option. (If you’re interested in the reasoning, I strongly recommend reading the Fedora Packaging Guidelines page on the subject). Fedora then provides a dependency-resolution mechanism that simplifies this case by ensuring that when you attempt to retrieve the package you want, it also automatically installs all of the packages that it depends on (and so on, recursively until they are all satisfied).

How do we deploy it now?

There are two schools of thought on this subject, which I will refer to as the “Fedora Approach” and the “Debian Approach”, since those two Linux distributions best represent them. (Note: my understanding of the Debian Approach is second-hand, so if I get any of the subtleties incorrect, please feel free to leave a comment and I’ll correct it).

The Debian Approach

In Debian and its derivatives (such as Ubuntu, Mint, etc.), when the package resolution is completed and the packages are downloaded, the user is required to indicate at that time their explicit decision on how the package must behave. Through a system called “debconf”, package installation is directly tied to deployment; the package installation cannot conclude without it being explicitly configured at that time. If the installation is non-interactive (such as if the installation was initiated by another service, rather than the user), the configuration must either be specified by an “answer file” (a configuration file passed to debconf stating the answers in advance) or else the package must provide a sensible set of defaults to automatically deploy it.

 The Fedora Approach

In Fedora and its derivatives (such as Red Hat Enterprise Linux, CentOS, Scientific Linux, etc.), when the package resolution is completed and the packages are downloaded, that’s it. In the vast majority of cases, the software is now on the system, but it is not configured to do anything at all. (There are a few specific exceptions which have been granted by the Fedora Engineering Steering Committee for things like the firewall). On these systems, nothing will happen until the user takes an explicit action to configure and start the services.

“That sounds like the Debian Approach is better!” you may say. However, there are concerns to be had here. For one, the above explanation I made about dependency-resolution comes into play; you as a user may not be fully aware of what packages are going to be pulled in by your dependencies (even accidentally). Furthermore, just because you installed a web-server package, it doesn’t mean that you necessarily want it running immediately. So, Fedora forces you to make these decisions explicitly, rather than implicitly. So when you’re ready, you configure the software and then start it up.

Where does this fall down?

The real problem is that the concept of “packages” derives very much from the engineering side of things. A package is a logical bundling of software for the developers. Not all problems can be solved with a single package, though. For example, the FreeIPA identity-management solution requires many top-level packages including an LDAP directory server, a certificate authority server, a DNS server and others. In this, the concept of a “package” gets more than a little fuzzy. In this particular case (as has been common historically), the solution was “Let’s make another package that glues them together!”. So the FreeIPA package just adds those other packages to its dependency chain.

But just adding more packages doesn’t necessarily solve the end-user concern: How do I easily deploy this?

Enter rolekit

Rolekit was designed to be specifically for handling the deployment situation and shield end-users from the concept of project-level packages. Instead, complete solutions will be “packaged” as Server Roles. Users will come to rolekit and declare a machine to be e.g. a Domain Controller, providing the minimum information necessary to set it up (today, that’s just an admin password in the Domain Controller example). Rolekit will handle all of the other necessary work under the hood, which involves downloading the appropriate packages, installing them on the system, setting up the configuration, starting the appropriate services and carefully opening up the firewall to allow access to it.

There are a lot of moving parts involved in deploying a role, but the user doesn’t really need to know what they are. If they can be shielded from much of the noise and churn inherent in package installation, configuration, service management and firewall settings, then they get back much of their time for solving the problems unique to their environments.

Fedora and Server Roles

As of Fedora 21, we have implemented the first release of the rolekit framework as well as a single representative Role: the Domain Controller. For Fedora 22, we’re working with the Cockpit project to produce a simple and powerful graphical interface to deploy the Domain Controller Role as well as building a new Database Server Role. As the project progresses, we very much hope that others will come forward to help us build more solutions. A few that I’d love to see (but don’t have time to start on yet):

  • A fileserver role that manages Samba and NFS file-shares (maybe [s]ftp as well).
  • A mail and/or groupware server role built atop something like Kolab
  • A backup server

Welcome to the post-package world, my friends!