This past week was marked by a series of personal findings related to the use of Global Transaction IDs (GTIDs) on Galera-based clusters such as Percona XtraDB Cluster (PXC). The main one is that transactions touching MyISAM tables (and FLUSH PRIVILEGES!) issued on a given node of the cluster are recorded in a GTID set bearing the node’s server_uuid as “source id” and added to the binary log (if the node has binlog enabled), and are thus replicated to any async replicas connected to it. However, they won’t be replicated across the cluster (that is, all of this is by design, if wsrep_replicate_myisam is …
Say you have a cluster with 3 nodes using Percona XtraDB Cluster (PXC) 5.6 and one asynchronous replica connected to node1. If asynchronous replication is using GTIDs, moving the replica so that it is connected to node2 is trivial, right? Actually replication can easily break for reasons that may not be obvious at first sight.
Summary
Let’s assume we have the following setup with 3 PXC nodes and one asynchronous replica:
Regarding MySQL GTIDs, a Galera cluster behaves like a distributed master: transactions coming from any node will use the same auto-generated uuid. This auto-generated uuid is related to the Galera uuid; it is neither ABC, nor DEF, nor GHI. Transactions executed for …
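One way to see which source UUIDs a node has recorded is to split its GTID set by source. The following is a minimal Python sketch, assuming you have already fetched the string (for example from `SELECT @@GLOBAL.gtid_executed`) and that you know the cluster-wide uuid. The function names and the UUIDs in the usage note are hypothetical and do not come from the original post.

```python
def parse_gtid_set(gtid_set):
    """Parse a GTID set string into {source_uuid: [(first, last), ...]}."""
    result = {}
    # Entries are comma-separated; the server may add a newline after each comma.
    for entry in gtid_set.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Each entry looks like "uuid:1-10:12": a uuid followed by intervals.
        uuid, *intervals = entry.split(":")
        ranges = result.setdefault(uuid.lower(), [])
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A single transaction number "12" is the interval (12, 12).
            ranges.append((int(first), int(last or first)))
    return result


def non_cluster_sources(gtid_set, cluster_uuid):
    """Return the source UUIDs in the set other than the cluster-wide one."""
    return sorted(
        uuid for uuid in parse_gtid_set(gtid_set) if uuid != cluster_uuid.lower()
    )
```

Calling `non_cluster_sources(executed, cluster_uuid)` on each node returns an empty list when every transaction carries the cluster-wide uuid. Any other uuid in the result points at transactions that were written only to that node's binary log.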
January 27, 2015 By Severalnines
Unlike standard MySQL server and MySQL Cluster, a MySQL/MariaDB Galera Cluster is started in a slightly different way. Galera requires you to start one node in the cluster as a reference point before the remaining nodes can join and form the cluster. This process is known as cluster bootstrap. Bootstrapping is the initial step that introduces a database node as the primary component, which the other nodes then use as a reference point to sync up data.
How does it work?
When Galera starts with the bootstrap command on a node, that particular node will reach the Primary state (check the value of wsrep_cluster_status). The remaining nodes only require a normal start command; they will automatically look for an existing Primary Component (PC) in the cluster and join it to form a cluster. Data synchronization then happens through either incremental state transfer (IST) or …
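The wsrep_cluster_status check mentioned above can be scripted. This is a minimal Python sketch, assuming the status rows were already fetched as (name, value) pairs, for example from `SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'`, with whatever client library you use. The helper name is hypothetical.

```python
def is_primary(status_rows):
    """Return True if wsrep_cluster_status reports 'Primary'.

    status_rows is an iterable of (variable_name, value) pairs, the shape
    most MySQL client libraries return for SHOW STATUS.
    """
    status = {name.lower(): value for name, value in status_rows}
    # A missing variable (wsrep provider not loaded) counts as not Primary.
    return status.get("wsrep_cluster_status", "").lower() == "primary"
```

A bootstrapped node should return True here, while a node that has not yet found a Primary Component should return False.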
A Percona XtraDB (Galera) Cluster node may fail to join the cluster because of any of the many possible mistakes that cause SST to fail. It could be a configuration item or purely a setup requirement. In this article we will troubleshoot, step by step, the SST issues encountered.
The post Debugging Percona Xtradb (Galera) Cluster node startup / SST errors first appeared on Change Is Inevitable.
State Snapshot Transfer (SST) is used in Percona XtraDB Cluster (PXC) when a new node joins the cluster or to resync a failed node if Incremental State Transfer (IST) is no longer available. SST is triggered automatically but there is no magic: If it is not configured properly, it will not work and new nodes will never be able to join the cluster. Let’s have a look at a few classic issues.
Port for SST is not open
The donor and the joiner communicate on port 4444, and if the port is closed on one side, SST will always fail.
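A quick way to rule out a blocked port is to attempt a plain TCP connection from the other side. This is a minimal Python sketch using only the standard library. The function name is hypothetical, and the default port is the 4444 mentioned above. Note the assumption that something is listening: a refused connection outside of a running SST may only mean that no transfer is in progress, so the check is most meaningful while the joiner is waiting for the transfer.

```python
import socket


def port_is_open(host, port=4444, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and applies the timeout for us.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Running `port_is_open("node3")` from the joiner, and the equivalent from the donor towards the joiner, shows which direction is blocked. Here `node3` stands for whatever hostname or IP address your nodes use.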
You will see in the error log of the donor that SST is started:
[...]
141223 16:08:48 [Note] WSREP: Node 2 (node1) requested state transfer from '*any*'. Selected 0 (node3)(SYNCED) as donor.
141223 16:08:48 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 6)
141223 16:08:48 …
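When scanning a long error log, it can help to pull the joiner and the donor out of the state transfer request line. This is a minimal Python sketch, assuming the message format is exactly the one in the excerpt above; other Galera versions may word it differently. The function and field names are hypothetical.

```python
import re

# Matches the "requested state transfer" message shown in the donor's log.
SST_REQUEST = re.compile(
    r"WSREP: Node (?P<joiner_index>\d+) \((?P<joiner_name>[^)]+)\) "
    r"requested state transfer from '(?P<requested_donor>[^']*)'\. "
    r"Selected (?P<donor_index>\d+) \((?P<donor_name>[^)]+)\)"
    r"\((?P<donor_state>[^)]+)\) as donor\."
)


def parse_sst_request(line):
    """Return joiner/donor details from a log line, or None if no match."""
    match = SST_REQUEST.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Node indexes are numeric in the log; convert them for convenience.
    info["joiner_index"] = int(info["joiner_index"])
    info["donor_index"] = int(info["donor_index"])
    return info
```

Applied to the first line of the excerpt, this reports node1 as the joiner and node3 as the selected donor.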
Background on Backup Locks
I was very excited to see backup locks support in the release notes for the latest Percona XtraDB Cluster 5.6.21 release. For those who are not aware, backup locks offer an alternative to FLUSH TABLES WITH READ LOCK (FTWRL) in XtraBackup. While XtraBackup can hot-copy InnoDB, everything else in MySQL must be locked (usually briefly) to get a consistent snapshot that lines up with InnoDB. This includes all other storage engines, but also things like table schemas (even on InnoDB) and async replication binary logs. You can skip this lock, but it isn’t …
HAProxy is frequently used as a software load balancer in the MySQL world. Peter Boros, in a past post, explained how to set it up with Percona XtraDB Cluster (PXC) so that it only sends queries to available nodes. The same approach can be used in a regular master-slaves setup to spread the read load across multiple slaves. However with MySQL replication, another factor comes into play: replication lag. In this case the approach mentioned for Percona XtraDB Cluster does not work that well as the check we presented only returns ‘up’ or ‘down’. We would like to be able to tune the weight of a replica inside HAProxy depending on its replication lag. This is what we will do in this post using HAProxy 1.5.
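To make the idea concrete, here is a minimal Python sketch of the mapping from replication lag to a weight. It assumes the HAProxy agent check accepts a newline-terminated percentage such as `75%` as a reply, which sets the weight relative to the configured one. The function name, the 60-second ceiling, the 1% floor and the linear scale are all hypothetical choices for illustration, not the thresholds used in the post.

```python
def agent_weight_reply(lag_seconds, max_lag=60, min_percent=1):
    """Map replication lag (seconds) to an agent-check weight reply.

    No lag gives full weight; the weight falls linearly as lag approaches
    max_lag and never drops below min_percent.
    """
    if lag_seconds < 0:
        raise ValueError("lag_seconds must be non-negative")
    remaining = max(0.0, 1.0 - lag_seconds / max_lag)
    percent = max(min_percent, round(100 * remaining))
    # The agent protocol expects an ASCII line terminated by a newline.
    return f"{percent}%\n"
```

A small TCP service on each replica could read the lag (for instance Seconds_Behind_Master from SHOW SLAVE STATUS) and send this string to HAProxy on every agent check. How to handle a replica whose replication is stopped is deliberately left out of this sketch.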
Agent …
Thanks to everyone who attended and participated in last week’s webinar on 'Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison'. If you missed the sessions or would like to watch the webinar again & browse through the slides, they are now available online.
In this webinar, Severalnines VP of Products, Alex Yu, who was part of the team at Ericsson who originally developed the NDB storage engine in 2001, gave an overview of the two clustering architectures and discussed their respective strengths and weaknesses:
- MySQL Cluster architecture: strengths and limitations
- Galera Architecture: strengths and limitations
- Deployment scenarios
- Data migration
- Read and write workloads (Optimistic/pessimistic locking)
- WAN/Geographical replication
- Schema changes
- Management and monitoring …
Introducing Consul
I’m always interested in what Mitchell Hashimoto and HashiCorp are up to, and I typically find their projects valuable. If you’ve heard of Vagrant, you know their work.
I recently became interested in a newer project they have called ‘Consul’. Consul is a bit hard to describe. It is (in part):
- Highly consistent metadata store (a bit like ZooKeeper)
- A monitoring system (lightweight Nagios)
- A service discovery system, both DNS and HTTP-based (think of something like HAProxy, but instead of TCP load balancing, it provides DNS lookups with healthy services)
What this has to do with Percona XtraDB Cluster
I’ve had some more complex testing for …
One new feature in recent Percona XtraDB Cluster (PXC) releases is the ability for an existing cluster to auto-bootstrap after an all-node-down event. Suppose you lose power on all nodes simultaneously or something else similar happens to your cluster. Traditionally, this meant manually re-bootstrapping the cluster, but not any more.
How it works
Given the above all-down situation, if all nodes are able to restart and see each other such that they all agree on what the state was and that all nodes have returned, then the nodes will decide that it is safe for them to recover PRIMARY state as a whole.
This requires:
- All nodes went down hard: that is, a kill -9, kernel panic, server power failure, or similar event
- All nodes from the last PRIMARY component are restarted …
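The membership requirement can be expressed as a simple set comparison. This is only an illustrative Python sketch of that one condition. It is not how Galera implements the decision, and the function name and node names are hypothetical.

```python
def all_last_primary_members_back(last_primary_members, restarted_nodes):
    """True when every node of the last PRIMARY component has restarted."""
    # Extra nodes in restarted_nodes do not matter; missing ones do.
    return set(last_primary_members) <= set(restarted_nodes)
```

With a last PRIMARY component of node1, node2 and node3, the check stays False until all three are back, which matches the idea that the nodes only recover PRIMARY state as a whole.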