Debugging MySQL SSL problems

This is not necessarily going to be a comprehensive post, but I learned some things about MySQL SSL today that I thought would be worth sharing.

I was setting up a PRM install for a customer and one of the requirements was SSL replication.  In this particular case, I had set up PRM first, and then was working to get the other requirements configured.  I knew from experience that it was best to ensure SSL was working properly from the command line first, before trying to get replication to use it via PRM’s automation that does the CHANGE MASTER for you.  Eliminate the variables.

The customer provided me with the CA cert, the server cert, and the server private key, and this was already working in an existing environment with the same MySQL version.  I had already added the relevant config options in the ‘mysqld’ section of the my.cnf:

[mysqld]
ssl-key=/etc/mysql/ssl/server-key.pem
ssl-cert=/etc/mysql/ssl/server-cert.pem
ssl-ca=/etc/mysql/ssl/ca-cert.pem

Now, maybe the rest of the world finds it easy to understand the difference between the server and client keys and certs, and whether you need all these options on the client side to connect to mysql with SSL, but I’ve always found it confusing.  From what I can tell, the client-side key and cert are really only necessary if you need the server to authenticate the client.  If you just need raw encryption without that validation, it’s enough to just give the client the CA cert:

mysql --ssl-ca=/etc/mysql/ssl/ca-cert.pem
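
To make that distinction concrete, here is a minimal sketch of a helper that prints the client options for both cases.  The function name ssl_client_opts is made up for illustration, and the client cert and key paths in the usage note are hypothetical (the customer's setup only had the server-side files shown above).

```shell
# Print the mysql client SSL options, one per line.
# $1: CA cert (enough on its own for an encrypted connection)
# $2, $3: client cert and key, only needed when the server must
#         authenticate the client
ssl_client_opts() {
    printf '%s\n' "--ssl-ca=$1"
    if [ -n "${2:-}" ] && [ -n "${3:-}" ]; then
        printf '%s\n' "--ssl-cert=$2" "--ssl-key=$3"
    fi
}
```

With paths that contain no spaces this can be used as mysql $(ssl_client_opts /etc/mysql/ssl/ca-cert.pem), or with hypothetical client files as mysql $(ssl_client_opts /etc/mysql/ssl/ca-cert.pem /etc/mysql/ssl/client-cert.pem /etc/mysql/ssl/client-key.pem).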

However in my case, I immediately got the dreaded:

ERROR 2026 (HY000): SSL connection error

As is normally the case when one encounters this error, the following steps were taken (in no particular order, or many times in some cases):

  • Checking the mysqld error log (clean as a whistle!)
  • Trying to find an option for the mysql cli that will output the actual SSL error (nary a one to be found!)
  • Running: SHOW VARIABLES LIKE '%ssl%' (mysql was adamant that it had openssl and ssl, and it had my SSL file paths correct)
  • Checking grants
  • Double-checking the my.cnf for misspellings
  • Restarting mysql
  • Checking file perms of the ssl files
  • Confirming that SSL worked fine on the old environment
  • Contemplating a nice career in baking

I went through a variety of theories before I found the right one.  They included:

  • Something was just wrong with the SSL files
  • This env was a major OS version newer than the old, but we were using the same MySQL packages built for the old OS, so maybe an openssl compatibility error
  • This just wasn’t my day
  • Maybe it was the chroot requirement

“Aha!”, you say.  chroot, you fool, of course that’s the problem! And indeed it was, but I’m getting ahead of myself.  How did I arrive at such a revelation?  Well, the openssl cli lets you set up a simple SSL client and server, which turns out to be a great way to verify that your SSL environment and keys/certs are working properly.  The basic test goes like this:

Start up a basic HTTP-ish server:

/usr/bin/openssl s_server -cert /etc/mysql/ssl/server-cert.pem -key /etc/mysql/ssl/server-key.pem -www

Now run a client against that server:

openssl s_client -CAfile /etc/mysql/ssl/ca-cert.pem -connect 127.0.0.1:4433

If all is working, then your client should output a copy of the server certificate and generally not give any errors.  If there are errors, it’s helpful to look at what the server outputs.
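
One thing to watch for: s_client does not abort by default when the chain fails to verify; it reports the result in a "Verify return code" line near the end of its output.  A minimal sketch for checking that line automatically follows.  The helper names verify_ok and check_tls_endpoint are made up for illustration.

```shell
# Read s_client output on stdin; succeed only if the chain verified cleanly.
verify_ok() {
    grep -q 'Verify return code: 0 (ok)'
}

# Hypothetical helper: connect to host:port ($2) trusting the CA file ($1).
# </dev/null makes s_client exit after the handshake instead of waiting
# for input on the terminal.
check_tls_endpoint() {
    openssl s_client -CAfile "$1" -connect "$2" </dev/null 2>&1 | verify_ok
}
```

With the s_server from above still running, check_tls_endpoint /etc/mysql/ssl/ca-cert.pem 127.0.0.1:4433 && echo "chain verified" gives a quick yes/no answer.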

In my case, this worked fine.  However, once I started suspecting the chroot environment, I was able to quickly start a basic MySQL server outside the chroot and confirm SSL worked fine there.  MySQL worked fine outside the chroot, and so did the basic openssl server, so how do I test the chroot environment?  By running the openssl s_server in the chroot, of course:

chroot /the/root /usr/bin/openssl s_server -cert /etc/mysql/ssl/server-cert.pem -key /etc/mysql/ssl/server-key.pem -www

I can run the client outside the chroot because I’m connecting over TCP, though there might be some cases where it’s helpful to also run that in the chroot.  However, in my case the openssl s_server quickly and efficiently told me everything that was wrong.  This included (as the Ops geniuses out there have already been screaming):

  • a big pile of libraries missing from the chroot
  • the absence of /etc/pki/tls/openssl.cnf (though I’m not convinced that’s necessary)
  • that I needed some more “randomness”, which quickly led me to set up /dev/random and /dev/urandom in the chroot.
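
For the curious, the first and last items in that list can be sketched roughly as below.  This is not my actual chroot build script, just a minimal illustration assuming a Linux host with ldd available; the function names are made up.

```shell
# Read `ldd` output on stdin and print the absolute path of each
# resolved library (lines like "libfoo.so => not found" print nothing).
ldd_paths() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }'
}

# Copy every shared library that binary $1 links against into chroot $2,
# preserving the directory layout.
copy_libs_into_chroot() {
    ldd "$1" | ldd_paths | while read -r lib; do
        mkdir -p "$2$(dirname "$lib")"
        cp -L "$lib" "$2$lib"
    done
}

# Create the kernel random devices inside chroot $1 (needs root).
# On Linux, /dev/random is char device 1,8 and /dev/urandom is 1,9.
make_random_devices() {
    mkdir -p "$1/dev"
    [ -e "$1/dev/random" ]  || mknod -m 666 "$1/dev/random" c 1 8
    [ -e "$1/dev/urandom" ] || mknod -m 666 "$1/dev/urandom" c 1 9
}
```

Usage would be along the lines of copy_libs_into_chroot /usr/bin/openssl /the/root followed by make_random_devices /the/root, and the same again for the mysqld binary.  Note that ldd only sees libraries linked at build time; anything loaded at runtime still has to be found the hard way, which is exactly what running s_server in the chroot is good for.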

Go figure!  After realizing the big gap in my script to build the chroot directories, I finally got what I had been looking for:

# mysql --ssl-ca=/data/mysql/etc/mysql/DigiCertCA.pem
...
mysql>
...
--------------
Connection id: 101
Current database:
Current user: root@localhost
SSL: Cipher in use is DHE-RSA-AES256-SHA
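
If you would rather check this from a script than eyeball the status output, the session's Ssl_cipher status variable carries the same information and is empty when the connection is not encrypted.  A minimal sketch, with made-up helper names:

```shell
# Read `SHOW STATUS LIKE 'Ssl_cipher'` output (tab-separated, no header)
# on stdin and print the cipher.  Fails if the value is empty, i.e. the
# session is not encrypted.
cipher_from_status() {
    awk -F '\t' '$1 == "Ssl_cipher" { c = $2 } END { if (c == "") exit 1; print c }'
}

# Hypothetical helper: connect using CA file $1 and report the cipher.
# -N drops the column header, -B gives tab-separated batch output.
session_cipher() {
    mysql --ssl-ca="$1" -N -B -e "SHOW STATUS LIKE 'Ssl_cipher'" | cipher_from_status
}
```

For example, session_cipher /data/mysql/etc/mysql/DigiCertCA.pem || echo "not using SSL".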

After this, it was trivial to get replication fully working with SSL.  Now that I’ve been through it once, it should be easy to debug in the future.