The Road to MySQL 5.6: Default Options

When you're testing out a new version of MySQL in a non-production environment there is a temptation to go wild and turn on all kinds of new features.  Especially if you're reading the changelogs or the manual and scanning through options.  You want to start with the most reasonable set of defaults, right?  Maybe you're even doing benchmarks to optimize performance using all the new bells and whistles.

Resist the temptation!  If your goal is to upgrade your production environment then what you really want is to isolate changes.  You want to perform the upgrade with as little impact as possible.  Then you can start turning on features or making changes one by one.

Why?  Anytime you're doing a major upgrade to something as fundamental as your core RDBMS, there are many ways things can go wrong.  Performance regressions & incompatible changes, client/server incompatibilities …

9 Tips for Going in Production with Galera Cluster for MySQL

August 25, 2014 By Severalnines

Are you going into production with Galera Cluster for MySQL? Here are 9 tips to consider before going live. They apply to all 3 Galera versions (Codership, Percona XtraDB Cluster and MariaDB Galera Cluster).


1. Galera strengths and weaknesses


There are multiple types of replication and cluster technologies for MySQL, so make sure you understand how Galera works and set the right expectations. Applications that run on single-instance MySQL might not work well on Galera: you might need to make some changes to the application, or the workload might not be appropriate. We’d suggest you have a look at these resources:

New Webinar: Repair and Recovery for your MySQL, MariaDB and MongoDB/TokuMX Clusters

December 19, 2013 By Severalnines

Database clusters are pretty sophisticated distributed systems with complex dependencies between nodes. The failure of a node will generally impact the overall cluster, as the remaining nodes need to reconfigure themselves to continue to operate without the failed node. Since re-introducing a node will also affect the existing cluster, the timing may depend on the state of the other nodes in the cluster. Repairs and restarts often need to be performed in a particular order, in compliance with the redundancy model of the cluster, so as not to jeopardize the normal functioning of existing nodes.


Webinar: Repair and Recovery for your MySQL, MariaDB and MongoDB/TokuMX clusters


Online Schema Upgrade in MySQL Galera Cluster using TOI Method

December 10, 2013 By Severalnines

As a follow-up to the Webinar on Zero Downtime Schema Changes in Galera Cluster, we’ll now walk you through the detailed steps on how to update your schema. The two methods (TOI and RSU) both have their pros and cons, and given parameters like table size, indexes, key_buffer_size, disk speed, etc., it is possible to estimate the time taken for the schema to be upgraded. Also, please note that a schema change is non-transactional, so it is not possible to roll back the DDL if it fails midway. Therefore, it is always recommended to test the schema changes and ensure you have recoverable backups before performing this on your production clusters.

This post examines the way DDL changes are propagated in Galera, and outlines the steps to upgrade the …

Upcoming Webinar: Zero Downtime Schema Changes in Galera Cluster

November 14, 2013 By Severalnines

Database schema changes are usually not popular among DBAs or sysadmins, especially when you are operating a cluster and cannot afford to switch off the service during a maintenance window. There are different ways to perform schema changes, some procedures being more complicated than others. We invited Seppo from the Codership team to tell us about the options. If you’d like to learn more, please register for our new webinar.


Webinar: Galera Cluster Best Practices - Zero Downtime Schema Changes

Tuesday, December 3rd 2013

Register now - Europe/MEA/APAC

Tips and tricks while working with Production DBs

From time to time we have to work with live environments and production databases. For some of us this is a day-to-day job. And most of the time the cost of a mistake is far higher than the expected improvement, especially on databases, because an issue on the database side affects everything else.

I have heard enough war stories about ruined production systems, and I can well imagine how fast a DROP DATABASE command replicates across a cluster. So I’m scared to make changes in production. The more there is to lose if things go wrong, the more scared I am when planning every change. But I still love to make improvements, so the only question is how to make them safer.

This post is not intended to be a guide or a set of best practices for avoiding issues altogether; it’s more an invitation to a discussion that started between me and @randomsurfer on Twitter about how to avoid production failures. …

Interesting Resources for Technical Operations Engineers

As a leader of a technical operations team I often have to work on hiring technical operations engineers. This process involves a lot of interviews with candidates, and during those interviews, along with many challenging practical questions, I really love to ask questions like “What are the most important resources you think an Operations Engineer should follow?”, “What books in your opinion are must-reads for a techops engineer?” or “Who are your personal heroes in the IT community?”. Those questions often give me a lot of information about candidates: their experience, who they look up to in the community, what they are interested in, and whether they are actively working on improving their professional level.

Recently, one of the candidates asked me to share my lists with him and I thought this information could be valuable to other people so I have decided to share it here on my blog.

Must-Read Books List

First …

Quantifying Abnormal Behavior in System Metrics

I’ve posted slides for my Velocity talk on VividCortex’s blog. The talk explained how we use exponentially weighted moving statistics to generate a meta-metric of abnormality for the time-series metrics measured from MySQL. That’s kind of a mouthful. Maybe you had to be there :-)

Cloud Deployment Interview

What does a cloud computing expert need to know? In part one of the cloud interview guide we covered some basic Unix & Linux systems administration skills, and cloud computing and infrastructure concepts. Those are key starting points. You might also want to jump to part 3: cloud DBA, architecture and management interview questions.

In this second part, let’s dig into deploying applications in the cloud, and day to day operations skills. There’s a lot of material here. We …

How to avoid two backups running at the same time

When your backup script runs for too long, a second backup can start while the previous one is still running. This increases pressure on the database, makes the server slower, can start a chain of backup processes and in some cases may break backup integrity.

The simplest solution is to avoid this situation by adding locking to your backup script, which prevents the script from starting a second time while it is already running.

Here is a working sample. You will need to replace the “sleep 10” string with the actual backup script call:


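#!/bin/bash

### Lock file path (example location, not from the original post; adjust as needed)
LOCK_NAME="/tmp/my_lock"

### Exit if a previous run still holds the lock
if [[ -e "$LOCK_NAME" ]] ; then
        echo "re-entry, exiting"
        exit 1
fi

### Placing lock file
touch "$LOCK_NAME"
echo -n "Started..."

### Performing required work
sleep 10

### Removing lock
rm -f "$LOCK_NAME"

echo "Done."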

It works perfectly most of the time. The problem is that you could still theoretically run two …
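
One common way to close the gap between checking for the lock and creating it is to do both in a single atomic step. The sketch below is not necessarily the approach the original post goes on to describe; it assumes that `mkdir` is atomic on the local filesystem, and the `run_locked` function name and the `/tmp/backup.lock.d` path are hypothetical choices made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: atomic locking with mkdir instead of "test, then touch".
# LOCK_DIR is a hypothetical path; override it to suit your system.
LOCK_DIR="${LOCK_DIR:-/tmp/backup.lock.d}"

run_locked() {
    # mkdir either creates the directory or fails, in one step,
    # so two processes cannot both get past this point.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "re-entry, exiting"
        return 1
    fi

    # Run the command passed as arguments and remember its exit status.
    local status=0
    "$@" || status=$?

    # Release the lock and report the command's own status.
    rmdir "$LOCK_DIR"
    return "$status"
}
```

To use it, call for example `run_locked sleep 10`, replacing `sleep 10` with the actual backup script call. If the process is killed before `rmdir` runs, the lock directory stays behind and has to be removed by hand, so a production script would probably also want a `trap` for cleanup.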

