Deep Dive into GTIDs and MySQL 5.6 - What, Why and How

Global Transaction Identifiers (GTIDs) are one of the key replication enhancements in MySQL 5.6. GTIDs make it simple to track and compare replication progress across a master-slave topology. This enables:

- Much simpler recovery from failures of the master

- Great flexibility in the provisioning and ongoing management of multi-tier or ring (circular) replication topologies

A new on-demand MySQL 5.6 GTID webinar delivered by the replication engineering team is now available, providing deep insight into the design and implementation of GTIDs, and how they enable users to simplify MySQL scaling and HA. The webinar covers:

- Concepts: What is a GTID? How does the server generate GTIDs? What is the life cycle of GTIDs? How are GTIDs used to connect to a master?

- Handling conflicts

- How to skip transactions using GTIDs

- What happens when binary logs are purged

- How to provision a new slave or restore from a backup

- MySQL utilities for automated failover and controlled switchover

To whet your appetite, here is an extract of the Q&A from the webinar. These and many other questions were answered during the session:

Q. Which versions of MySQL support GTIDs?

A. MySQL 5.6.5 and above

Q. Is GTID ON by default in 5.6?

A. It is OFF by default

Q. What does the GTID contain?

A. It is made up of a unique ID for the server followed by an ever-increasing counter that's specific to that server
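As a rough illustration of that structure, the sketch below splits a GTID written in its textual form, `server_uuid:transaction_number`, into its two parts. The `Gtid` type, the `parse_gtid` helper and the sample value are hypothetical names chosen for this example; the sketch assumes the server ID is the server's UUID and that the counter starts at 1.

```python
import uuid
from typing import NamedTuple


class Gtid(NamedTuple):
    """Hypothetical container for the two parts of a GTID."""
    server_uuid: uuid.UUID  # unique ID of the server that committed the transaction
    sequence: int           # ever-increasing counter specific to that server


def parse_gtid(text: str) -> Gtid:
    """Split 'server_uuid:number' into its server ID and counter."""
    source, _, number = text.strip().rpartition(":")
    if not source or not number.isdigit() or int(number) < 1:
        raise ValueError(f"not a valid GTID: {text!r}")
    return Gtid(uuid.UUID(source), int(number))


# Illustrative value only, not taken from a real server.
gtid = parse_gtid("3E11FA47-71CA-11E1-9E33-C80AA9429562:23")
print(gtid.server_uuid, gtid.sequence)
```

Because the counter is scoped to a single server, two GTIDs can only be ordered by their counters when they share the same server UUID.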

Q. Do GTIDs introduce any increased space requirements?

A. Yes. Since GTIDs are stored in the binary log, the binary logs will be larger. However, we expect the overhead to be relatively small. GTIDs are written to the binary log in two places:

(1) A small header is stored at the beginning of each binary log. This contains the variable @@gtid_purged, i.e., a list of previously logged GTIDs. Since the list is range-compressed, this is expected to be small: a small fixed-size header plus 40 bytes times the number of master servers in your topology.

(2) A fixed-size header is added before each transaction in the binary log. This is 44 bytes, so it will typically be small compared to the transaction itself.
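To put those two figures in perspective, here is a back-of-the-envelope estimate. It is only a sketch: the function name and the workload numbers are made up for illustration, and the unspecified "small fixed-size header" part of item (1) is left out of the total.

```python
PER_MASTER_BYTES = 40      # per master server, in the header of each binary log file
PER_TXN_HEADER_BYTES = 44  # fixed-size header written before each transaction


def gtid_overhead_bytes(masters: int, transactions: int, binlog_files: int) -> int:
    """Estimate the extra binary log bytes attributable to GTIDs.

    Excludes the small fixed-size part of each file header, whose exact
    size is not given in the text above.
    """
    file_headers = PER_MASTER_BYTES * masters * binlog_files
    txn_headers = PER_TXN_HEADER_BYTES * transactions
    return file_headers + txn_headers


# Hypothetical workload: 3 masters, 1,000,000 transactions spread over 10 binlog files.
overhead = gtid_overhead_bytes(masters=3, transactions=1_000_000, binlog_files=10)
print(f"{overhead:,} bytes")
```

For this made-up workload the per-transaction headers dominate (44,000,000 of 44,001,200 bytes), so the overhead scales with the transaction count rather than with the size of the topology.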

Q. I understand GTIDs are associated with transactions. How do they map to the events within each transaction, or does a GTID map to an event of its own in a binlog file?

A. Yes, GTIDs are associated with transactions. In the binary log, the GTID is realized as an event written prior to the events that constitute the transaction. The event is called a Gtid_log_event.

Q. What if a transaction spans a filtered-out table and a non-filtered-out table? How does it get recorded on the slave?

A. If the filters are on the master, then a partly logged transaction will be replicated with its GTID.

If the filtering is on the slave side, a partial image will be processed on the slave and the original GTID is logged (to the slave's binlog) with the processed transaction.

Q. Prior to GTID, to build a new slave, we use mysqldump --master-data=1 to get the slave starting sync point in the dump. With GTID enabled, does it set the gtid_executed / purged in the dump instead?

A. Yes, mysqldump will detect that the server uses GTIDs and output a SET GTID_PURGED statement. (There is also an option to turn that off, --set-gtid-purged=OFF, e.g., in case you want to execute the output on an older server.)

Q. How do GTIDs enable failover and recovery?

A. GTIDs are used in combination with the MySQL utilities. The mysqlfailover and mysqlrpladmin utilities provide administration of GTID-enabled slaves, enabling monitoring with automatic failover and on-demand switchover, coupled with slave promotion. GTIDs make it straightforward to fail over reliably and automatically from the master to the most current slave in the event of a failure. DBAs no longer need to manually analyze the status of each of their slaves to identify the most current one when seeking a target to promote to the new master.
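To give a feel for why GTIDs make "the most current slave" easy to identify, the sketch below counts the transactions in each slave's executed GTID set and picks the largest. This is not how mysqlfailover is implemented; it is a simplified illustration that assumes a single master, GTID sets in the textual form `uuid:first-last[:first-last...]`, and hypothetical slave names and values.

```python
from typing import Dict


def count_transactions(gtid_set: str) -> int:
    """Count the transactions in a GTID set such as 'uuid:1-80:85-90'."""
    total = 0
    for part in gtid_set.split(","):
        part = part.strip()
        if not part:
            continue
        _source, *intervals = part.split(":")
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A bare number such as '23' is an interval holding one transaction.
            total += int(last or first) - int(first) + 1
    return total


def most_current(slaves: Dict[str, str]) -> str:
    """Return the name of the slave that has executed the most transactions."""
    return max(slaves, key=lambda name: count_transactions(slaves[name]))


# Hypothetical executed GTID sets reported by three slaves of the same master.
MASTER = "3E11FA47-71CA-11E1-9E33-C80AA9429562"
slaves = {
    "slave1": f"{MASTER}:1-90",
    "slave2": f"{MASTER}:1-97",
    "slave3": f"{MASTER}:1-80:85-90",
}
print(most_current(slaves))
```

With multiple masters a plain count is not enough, since two slaves could hold equally many but different transactions; a real tool would need to compare the sets themselves rather than their sizes.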

Resources to Get Started

In addition to the webinar, here are some other key resources that will give you the detail you need to take advantage of GTIDs in your most important MySQL workloads:

- Engineering blog: Global Transaction Identifiers – why, what, and how

- Engineering blog: Advanced use of GTIDs

- Documentation: Setting up replication with GTIDs 

- Video Tutorial: MySQL replication utilities for auto-failover and switchover 

- Engineering blog: Controlling read consistency with GTIDs

If you have any comments, questions or feature requests, don't hesitate to leave a comment on this blog.