Long time ago I wrote a blog post that we started using
Semi-Synchronous replication in production at Facebook. We are
running our own MySQL facebook-5.6 branch, which added many replication
(including Semisync) patches. Some of them were also merged with
WebScaleSQL.
Recently I heard from people from community that they were
suffering from Semisync performance issues in official MySQL 5.6.
I took some time to review our previous MySQL bug reports, then
realized that some important bugs were either still "verified" or
inefficiently fixed. Two most affecting bug reports were https://bugs.mysql.com/bug.php?id=70342 and
http://bugs.mysql.com/bug.php?id=70669. We fixed
both at our branch so I haven't paid much attention after that,
but people outside Facebook are certainly affected.
In this post, I'm going to describe some effective
configurations to get better Semisync throughput on master and
slaves, by showing simple benchmark numbers. I used three
machines -- client and master and semisync slave --, all running
on pure flash. They are located within very close distance. I
created 100 databases and enabled Multi-Threaded-Slave, and ran
100 mysqlslap processes for 100 databases, with 30 concurrent
connections each (3000 concurrent connections in total). All
queries were auto-committed inserts and I used InnoDB storage
engine on both master and slaves.
1. Set master_info_repository=TABLE
MySQL 5.6 and 5.7 have a
performance bug that writing FILE based master info (and
relay log info) files are very expensive. This is especially
serious for Semisync replication, since this bug slows down IO
thread. On Semisync, slow IO threads takes longer time to send
ACK back to the master, so it slows down master throughput as
well. Default master_info_repository is FILE, so without changing
this parameter to TABLE, you are affected by this bug.
Here are benchmark results between FILE and TABLE.
5.6.22 master_info_repository=FILE : 5870/s
5.6.22 master_info_repository=TABLE: 7365/s
These numbers were the number of commits per second on both
master and slave instances. Slave didn't lag for most
experiments, thanks to multi-threaded slave feature.
Please don't confuse between master_info_repository and
relay_log_info_repository parameters. relay_log_info_repository
has to be TABLE, otherwise crash safe slave doesn't work.
master_info_repository works on both FILE and TABLE, but I
suggest to use TABLE for performance reasons.
2. Reduce durability on master
Older 5.6 had a bug that slaves couldn't continue
replication after crashed master's recovery, even if setting
fully durable configurations. Here is a closed bug
report.
This bug report was closed, but it caused some performance
regression. Master extended LOCK_log mutex holding duration --
releasing LOCK_log mutex after calling fsync(). This certainly
fixed the bug, but caused performance regression because LOCK_log
was very hot mutex in 5.6 -- both Binlog Dump thread and
application threads need to hold the lock. Hopefully reducing
durability (I mean setting sync_binlog=0 and
innodb_flush_log_at_trx_commit=0|2) mitigates the regression a
lot. When using Semisync replication, you are most certainly to
promote a slave on master failure, so durability on master does
not matter much == You can reduce durability on master.
5.6.22 master_info_repository=TABLE, full durability:
7365/s
5.6.22 master_info_repository=TABLE, less durability:
9800/s
3. Loss Less Semisync
MySQL 5.7 improved replication performance a lot. Probably
the most effective improvement was that Binlog Dump thread no longer held LOCK_log
mutex. In addition to that, 5.7 introduced "Loss-Less Semisync" feature.
If you read my previous Semisync blog post carefully, you
may have noticed that 1. we backported "Loss-Less Semisync" from
5.7, and 2. we got better throughput with Loss-Less Semisync than
Normal Semisync. This was because Loss-Less Semisync actually
reduced mutex contentions -- LOCK_commit and LOCK_binlog_. My 5.7
benchmark result was as follows.
5.7 Normal Semisync: 12302/s
5.7 Loss Less Semisync: 14500/s
(master_info_repository=TABLE, less durable)
Reducing LOCK_log contention on Binlog Dump thread and
introducing Loss Less Semisync were major contributors in MySQL
5.7 performance improvements. At Facebook, we ported both in our
5.6 branch. It would be interesting for community if these can be
available on other distributions, since using 5.7 in production
will not happen anytime soon.
In addition to the above three configurations, there are still
some considerations to make Semisync throughput not bad /
better.
4. Semisync mysqlbinlog
At Facebook, we implemented mysqlbinlog to speak Semisync
protocol, and used it as a Semisync replication reader. On
replication slaves, IO thread and SQL thread conflict with
internal mutexes. As I mentioned above, slow IO thread slows down
Semisync master throughput. Semisync mysqlbinlog doesn't have
such slowdown, because it doesn't have SQL thread. So using
Semisync mysqlbinlog instead of Semisync slave can improve master
throughput.
5.7 Loss Less Semisync + Semisync slave: 14500/s (on both master
and slave)
5.7 Loss Less Semisync + Semisync mysqlbinlog + async slave:
48814/s on master, 15363/s on slave
This shows Semisync mysqlbinlog improved master throughput.
But this is actually not so good news -- because slave lags a
lot. Fundamentally we need to fix mutex contentions between IO
thread and SQL threads.
5. GTID
There are some open performance bugs in GTID. Especially
5.7 one is serious. If you really need high
throughput Semisync, you need to carefully benchmark with GTID
(and ask Oracle to fix!).
Here are whole table definitions and mysqlslap commands I used
for benchmark.
for i in `seq 1 100` do mysql -e "drop database test$i" mysql -e "create database test$i" mysql test$i -e "create table t1 (id int auto_increment primary key, \ value int, value2 varchar(100)) engine=innodb" done
for i in `seq 1 100` do mysqlslap --host=master --query="insert into test$i.t1 \ values (null, 1, 'abcdefghijklmn')" --number-of-queries=100000 --concurrency=30 & done