To help the more than 1.23 billion people who use Facebook to
share and connect with each other, we’ve had to build an
expansive and incredibly advanced infrastructure -- including one
of the largest deployments of MySQL in the world. Along the way,
we’ve learned and benefited from code changes made by the MySQL
community. Today we’re announcing WebScaleSQL, a collaboration
among engineers from several companies that face similar
challenges in running MySQL at scale and seek greater performance
from a database technology tailored to their needs.
WebScaleSQL currently includes contributions from MySQL engineering teams at Facebook, Google, LinkedIn, and Twitter. Together, we’re working to share a common base of code changes to the upstream MySQL branch that we can all use and that will be made available via open source. This collaboration will expand on existing efforts by the MySQL community, and we will continue to track the upstream branch that is the latest, production-ready release (currently MySQL 5.6).
Our goal in launching WebScaleSQL is to enable the scale-oriented
members of the MySQL community to work more closely together in
order to prioritize the aspects that are most important to us. We
aim to create a more integrated system of knowledge-sharing to
help companies leverage the great features already found in MySQL
5.6, while building and adding more features that are specific to
deployments in large-scale environments. In the last few months,
engineers from all four companies have contributed code and
provided feedback to each other to develop a new, more unified,
and more collaborative branch of MySQL.
But as effective as this collaboration has been so far, we know we’re not the only ones who are trying to solve these particular challenges. So we will keep WebScaleSQL open as we go, to encourage others who have the scale and resources to customize MySQL to join in our efforts. And of course we will welcome input from anyone who wants to contribute, regardless of what they’re currently working on.
What we’ve built so far:
We want WebScaleSQL engineers to collaborate effectively and to move fast. To that end, we have set up a system for collaborating, reviewing code, and reporting bugs. For example, when a WebScaleSQL engineer proposes a code change, a WebScaleSQL engineer from another company reviews the code and provides feedback. If both engineers agree the change makes sense and is functional, it is pushed into the WebScaleSQL branch for everyone to use.
Beyond this, each organization may further customize WebScaleSQL
to suit its own needs, just as we all do today.
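The review rule described above can be sketched in a few lines of Python. This is only an illustration of the policy, not the tooling we actually use: the `Change` class, the `approve` and `can_merge` helpers, and the engineer names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Change:
    """A proposed code change and the approvals it has collected so far."""
    author: str
    author_company: str
    # Each approval is a (reviewer, reviewer_company) pair.
    approvals: List[Tuple[str, str]] = field(default_factory=list)


def approve(change: Change, reviewer: str, reviewer_company: str) -> None:
    """Record that a reviewer agrees the change makes sense and works."""
    change.approvals.append((reviewer, reviewer_company))


def can_merge(change: Change) -> bool:
    """A change is mergeable once an engineer from another company approves it."""
    return any(
        company != change.author_company for _, company in change.approvals
    )
```

In this sketch, an approval from a colleague at the author's own company is recorded but does not make the change mergeable; only a cross-company approval does.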
This has already produced exciting results. Working together, the engineers involved in WebScaleSQL have made major changes to aid in the development of the new branch, including:
- An automated framework that will, for each proposed change, run MySQL's built-in test system (mtr) and publish the results.
- A full new suite of stress tests (https://github.com/webscalesql/webscalesql-5.6/commit/8b6adf69913226cab5cf8aaf45914e66b812692d) and a prototype automated performance testing system.
- Several changes to the tests already found in MySQL, and to the structure of some existing code, to avoid problems where otherwise safe code changes had previously caused tests to fail or caused unnecessary conflicts. These changes make it easier to work on the code and helped us get started creating WebScaleSQL.
- Several changes to improve the performance of WebScaleSQL, including buffer pool flushing improvements (https://github.com/webscalesql/webscalesql-5.6/commit/1aa4d3cf18f71d7e30da35cc4082a786c2870f49, https://github.com/webscalesql/webscalesql-5.6/commit/d90a06daebb3abbbb3aacfe23168a33c7a940c4a), optimizations to certain types of queries (https://github.com/webscalesql/webscalesql-5.6/commit/d72b580597fecbdbb5b2f96cc9f57c946889fea4), support for NUMA interleave policy (https://github.com/webscalesql/webscalesql-5.6/commit/175520ac44545decff760506fa24b98ea5c21dff), and more.
- New features that make operating WebScaleSQL at true web scale easier, such as super_read_only (https://github.com/webscalesql/webscalesql-5.6/commit/4142091449dd439d473ab22f2e5d60b326e01dc7), and the ability to specify sub-second client timeouts (https://github.com/webscalesql/webscalesql-5.6/commit/c1d98ebd607c571f554e96c2b477a7d9f826b4bf).
What we’re working on now:
After these initial accomplishments, we’ve started work on a number of other improvements to upstream MySQL. Here are a few of the projects that Facebook’s WebScaleSQL team is currently working on:
- Contributing an asynchronous MySQL client (https://reviews.facebook.net/D17025, https://reviews.facebook.net/D17031), which means that while querying MySQL, we don’t have to block while we connect, send a query, or retrieve results. This non-blocking client (http://www.percona.com/live/mysql-conference-2014/sessions/asynchronous-mysql-how-facebook-queries-databases) has been used in production at Facebook for many months and is currently being code-reviewed by the other WebScaleSQL teams.
- Preparing to move Facebook's production-tested versions of table, user, and compression statistics into WebScaleSQL.
- Preparing to push the remaining components of Facebook's current production-tested version of compression that were not already included in MySQL 5.6 into WebScaleSQL.
- Adding the Logical Read-Ahead mechanism (http://yoshinorimatsunobu.blogspot.com/2013/10/making-full-table-scan-10x-faster-in.html), which we have proven in production to deliver large, quantifiable speed improvements (up to 10x) for full table scans, such as those performed by nightly logical backups.
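To illustrate the general idea behind the asynchronous client mentioned in the first item above, here is a minimal Python sketch of the non-blocking pattern. It does not use or reproduce the actual client's API: `fake_query` is a hypothetical stand-in that simulates network latency with `asyncio.sleep`, and the shard names and latencies are made up.

```python
import asyncio
import time
from typing import Dict, List, Tuple


async def fake_query(shard: str, latency_s: float) -> str:
    """Hypothetical stand-in for a non-blocking query to one database."""
    # The sleep simulates waiting on the network; control returns to the
    # event loop instead of blocking the thread.
    await asyncio.sleep(latency_s)
    return f"rows from {shard}"


async def query_all(shards: Dict[str, float]) -> List[str]:
    """Issue every query at once and collect results in input order."""
    pending = [fake_query(name, latency) for name, latency in shards.items()]
    return await asyncio.gather(*pending)


def run_demo() -> Tuple[List[str], float]:
    """Run three simulated queries concurrently and time the whole batch."""
    shards = {"shard-a": 0.1, "shard-b": 0.1, "shard-c": 0.1}
    start = time.perf_counter()
    results = asyncio.run(query_all(shards))
    elapsed = time.perf_counter() - start
    return results, elapsed
```

Because the three simulated queries wait concurrently, the batch should finish in roughly the time of the slowest query rather than the sum of all three, which is the benefit a non-blocking client is meant to provide.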
What to expect in the future:
We will keep all our WebScaleSQL work open, to create a useful branch for others within the MySQL community who are focused on large-scale deployments. We’ll continue to follow the most up-to-date upstream version of MySQL. As long as the MySQL community releases continue, we are committed to remaining a branch – and not a fork – of MySQL.
We’re excited to expand our existing work on WebScaleSQL, and we think that this collaboration represents an opportunity for the scale-oriented members of the MySQL community to work together in a more efficient and transparent way that will benefit us all.
To learn more about how to get involved, visit: http://webscalesql.org/