A Primer on SSL/TLS Certificates
Many of my readers (being technical folks) are probably already aware of the purpose and value of certificates, but in case you are not familiar with them, here’s a quick overview of what they are and how they work.
First, we’ll discuss public-key encryption and public-key infrastructure (PKI). It was realized very early on in human history that sometimes you want to communicate with other people in a way that prevents unauthorized people from listening in. All throughout time, people have been devising mechanisms for obfuscating communication in ways that only the intended recipient of the code would be able to understand. This obfuscation is called encryption, the data being encrypted is called plaintext and the encrypted data is called ciphertext. The cipher is the mathematical transformation that is used to turn the plaintext into the ciphertext and relies upon one or more keys known only to trusted individuals to get the plaintext back.
Early forms of encryption were mainly “symmetric” encryption, meaning that the cipher used the same key for both encryption and decryption. If you’ve ever added a password to a PDF document or a ZIP file, you have been using symmetric encryption. The password is a human-understandable version of a key. For a visual metaphor, think about the key to your front door. You may have one or more such keys, but they’re all exactly alike and each one of them can both lock and unlock the door and let someone in.
Nowadays we also have forms of encryption that are “asymmetric”. What this means is that one key is used to encrypt the message and a completely different key is used to decrypt it. This is a bit harder for many people to grasp, but it works on the basic mathematical principle that some actions are much more complicated to reverse than others. (A good example I’ve heard cited is that it’s pretty easy to figure out the square of any number with a pencil and a couple minutes, but most people can’t figure out a square-root without a modern calculator). This is harder to visualize, but the general idea is that once you lock the door with one key, only the other one can unlock it. Not even the one that locked it in the first place.
So where does the “public” part of public-key infrastructure come in? What normally happens is that once an asymmetric key-pair is generated, the user will keep one of those two keys very secure and private, so that only they have access to it. The other one will be handed out freely through some mechanism to anyone at all that wants to talk to you. Then, if they want to send you a message, they simply encrypt their message using your public key and they know you are the only one who can decrypt it. On the flip side, if the user wanted to send a public message but provide assurance that it came from them, they can also sign a message with the private key, so that the message will contain a special signature that can be decrypted with their public key. Since only one person should have that key, recipients can trust it came from them.
Astute readers will see the catch here: how do users know for certain that your public key is in fact yours? The answer is that they need to have a way of verifying it. We call this establishing trust and it’s exceedingly important (and, not surprisingly, the basis for the rest of this blog entry). There are many ways to establish trust, with the most foolproof being to receive the public key directly from the other party while looking at two forms of picture identification. Obviously, that’s not convenient for the global economy, so there needs to be other mechanisms.
Let’s say the user wants to run a webserver at “www.mydomain.com”. This server might handle private user data (such as their home address), so a wise administrator will set the server up to use HTTPS (secure HTTP). This means that they need a public and private key (which in this case we call a certificate). The common way to do this is for the user to contact a well-known certificate authority and purchase a signature from them. The certificate authority will do the hard work of verifying the user’s identity and then sign their webserver certificate with the CA’s own private key, thus providing trust by way of a third-party. Many well-known certificate authorities have their public keys shipped by default in a variety of operating systems, since the manufacturers of those systems have independently verified the CAs in turn. Now everyone who comes to the site will see the nice green padlock on their URL bar that means their communications are encrypted.
A Primer on Self-Signed Certificates
One of the major drawbacks to purchasing a CA signature is that it isn’t cheap: the CAs (with the exception of Let’s Encrypt) are out there to make money. When you’re developing a new application, you’re going to want to test that everything works with encryption, but you probably aren’t going to want to shell out cash for every test server and virtual machine that you create.
The solution to this has traditionally been to create what is called a self-signed certificate. What this means is that instead of having your certificate signed by a certificate authority, you instead use the certificates public key to add a signature to the private key. The problem with this approach is that web browsers and other clients that verify the security of the connection will be unable to verify that the server is who it says it is. In most cases, the user will be presented with a warning page that informs them that the server is pretending to be the one you went to. When setting up a test server, this is expected. Unfortunately, however, clicking through and saying “I’m sure I want to connect” has a tendency to form bad habits in users and often results in them eventually clicking through when they shouldn’t.
It should be pretty obvious, but I’ll say it anyway: Never use a self-signed certificate for a production website.
One of the problems we need to solve is how to avoid training users to ignore those warnings. One way that people often do this is to load their self-signed certificate into their local trust store (the list of certificate authorities that are trusted, usually provided by the operating system vendor but available to be extended by the user). This can have some unexpected consequences, however. For example, if the test machine is shared by multiple users (or is breached in a malicious attack), then the private key for the certificate might fall into other hands that would then use it to sign additional (potentially malicious) sites. And your computer wouldn’t try to warn you because the site would be signed by a trusted authority!
So now it seems like we’re in a Catch-22 situation: If we load the certificate into the trusted authorities list, we run the risk of a compromised private key for that certificate tricking us into a man-in-the-middle attack somewhere and stealing valuable data. If we don’t load it into the trust store, then we are constantly bombarded by a warning page that we have to ignore (or in the case of non-browser clients, we may have to pass an option not to verify the client) in which case we could still end up in a man-in-the-middle attack, because we’re blindly trusting the connection. Neither of those seems like a great option. What’s a sensible person to do?
Two Better Solutions
So, let’s take both of the situations we just learned about and see if we can locate a middle ground somewhere. Let’s go over what we know:
- We need to have encryption to protect our data from prying eyes.
- Our clients need to be able to trust that they are talking to the right system at the other end of the conversation.
- If the certificate isn’t signed by a certificate in our trust store, the browser or other clients will warn or block us, training the user to skip validation.
- If the certificate is signed by a certificate in our trust store, then clients will silently accept it.
- Getting a certificate signed by a well-known CA can be too expensive for an R&D project, but we don’t want to put developers’ machines at risk.
So there are two better ways to deal with this. One is to have an organization-wide certificate authority rather than a public one. This should be managed by the Information Technologies staff. Then, R&D can submit their certificates to the IT department for signing and all company systems will implicitly trust that signature. This approach is powerful, but can also be difficult to set up (particularly in companies with a bring-your-own-device policy in place). So let’s look at a another solution that’s closer to the self-signed approach.
The other way to deal with it would be to create a simple site-specific certificate authority for use just in signing the development/test certificate. In other words, instead of generating a self-signed certificate, you would generate two certificates: one for the service and one to sign that certificate. Then (and this is the key point – pardon the pun), you must delete and destroy the private key for the certificate that did the signing. As a result, only the public key of that private CA will remain in existence, and it will only have ever signed a single service. Then you can provide the public key of this certificate authority to anyone who should have access to the service and they can add this one-time-use CA to their trust store.
Now, I will stress that the same rule holds true here as for self-signed certificates: do not use this setup for a production system. Use a trusted signing authority for such sites. It’s far easier on your users.
A Tool and a Tale
I came up with this approach while I was working on solving some problems for the Fedora Project. Specifically, we wanted to come up with a way to ensure that we could easily and automatically generate a certificate for services that should be running on initial start-up (such as Cockpit or OpenPegasus). Historically, Fedora had been using self-signed certificates, but the downsides I listed above gnawed at me, so I put some time into it and came up with the private-CA approach.
In addition to the algorithm described above, I’ve also built a proof-of-concept tool called sscg (the Simple Signed Certificate Generator) to easily enable the creation of these certificates (and to do so in a way that never drops the CA’s private key onto a filesystem; it remains in memory). I originally wrote it in Python 3 and that version is packaged for use in Fedora today. This past week as a self-assigned exercise to improve my knowledge of Go, I rewrote the sscg in that language. It was a fun project and had the added benefit of removing the fairly heavyweight dependency on the Python 3 version. I plan to package the golang version for Fedora 25 at some point in the near future, but if you’d like to try it out, you can clone my github repository. Patches and suggestions for functionality are most welcome.