Showing entries 21 to 26
« 10 Newer Entries
Displaying posts with tag: Jay Janssen
Finding a good IST donor in Percona XtraDB Cluster 5.6

Gcache and IST

The Gcache is a memory-based cache of recent Galera transactions that is local to each node in a cluster.  If a node leaves and rejoins the cluster, it can use the Gcache of another node that stayed in the cluster (i.e., its donor node) to fetch only the transactions it missed, an incremental state transfer (IST), as opposed to doing a full state snapshot transfer (SST).  However, there are a few nuances that are not obvious to the beginner:

  • The Gcache is lost when a node restarts
  • The Gcache is fixed-size and implemented as an LRU.  Once it is full, older transactions roll off.
  • Donor selection is made regardless of the Gcache state
  • If the given donor for a restarting node doesn’t have all the transactions needed, a full SST (read: full backup) is done instead
  • Until recent developments, there was no way to tell what, precisely, was in the Gcache.
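
The points above boil down to a simple check, sketched here in Python. This is only an illustration under stated assumptions: the function names and the idea of describing each node by the lowest and highest sequence number it holds are hypothetical, not part of Galera or PXC.

```python
def can_use_ist(joiner_last_seqno, donor_cached_from, donor_last_seqno):
    """Return True if the donor's Gcache still holds everything the joiner missed.

    All three arguments are hypothetical inputs for this sketch:
    joiner_last_seqno  -- last transaction the joiner applied before leaving
    donor_cached_from  -- oldest transaction still present in the donor's Gcache
    donor_last_seqno   -- newest transaction the donor has
    """
    first_needed = joiner_last_seqno + 1
    if first_needed > donor_last_seqno:
        return True  # the joiner missed nothing
    # Older transactions roll off a full Gcache, so the donor must still
    # hold the very first transaction the joiner needs.
    return donor_cached_from <= first_needed


def ist_capable_donors(joiner_last_seqno, donors):
    """Filter a {name: (cached_from, last_seqno)} mapping down to usable donors."""
    return [
        name
        for name, (cached_from, last_seqno) in donors.items()
        if can_use_ist(joiner_last_seqno, cached_from, last_seqno)
    ]


# Made-up example values: node2's cache has already rolled past seqno 1001.
donors = {"node2": (1200, 2000), "node3": (900, 2000)}
usable = ist_capable_donors(1000, donors)
```

Because donor selection does not consult the Gcache, a joiner handed a donor outside the `usable` list in this sketch would fall back to a full SST.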

So, with (somewhat) …

[Read more]
Automatic replication relaying in Galera 3.x (available with PXC 5.6)

A decade ago MySQL folks were in love with the concept of a relay slave for MySQL high availability across data centers.  A relay is a single slave in a remote data center that receives replication from the global master and, in turn, replicates to all the other local slaves in that data center.  This saved a lot of bandwidth, especially back in the days before memcached when scaling reads meant lots of slaves.  Sending 20 copies of your replication stream cross-WAN gets expensive.
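
A rough back-of-the-envelope sketch of that saving, in Python. The function names and the sample stream rate are made up for illustration; the only thing taken from the text is that a relay means one copy of the stream crosses the WAN instead of one per remote slave.

```python
def wan_copies(remote_slaves, use_relay):
    """Number of copies of the replication stream that cross the WAN."""
    if remote_slaves <= 0:
        return 0
    # With a relay, only the relay itself replicates from the global master.
    return 1 if use_relay else remote_slaves


def wan_bandwidth(stream_bytes_per_sec, remote_slaves, use_relay):
    """Cross-WAN bandwidth consumed by replication, in bytes per second."""
    return stream_bytes_per_sec * wan_copies(remote_slaves, use_relay)


# Hypothetical 1 MB/s replication stream feeding 20 remote slaves.
direct = wan_bandwidth(1_000_000, 20, use_relay=False)
relayed = wan_bandwidth(1_000_000, 20, use_relay=True)
```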

In Galera and Percona XtraDB Cluster (PXC), by default, when a transaction commits on a given node, it is sent from that node to every other node in the cluster.  That is, the actual writeset payload (the RBR events) is sent over the network to every other node, so the bandwidth to replicate is roughly:

<writeset size> * …
[Read more]
New wsrep_provider_options in Galera 3.x and Percona XtraDB Cluster 5.6

Now that Percona XtraDB Cluster 5.6 is out in beta, I wanted to start a series talking about new features in Galera 3 and PXC 5.6.  On the surface, Galera 3 doesn’t reveal a lot of new features yet, but there has been a lot of refactoring of the system in preparation for great new features in the future.

Galera vs MySQL options

wsrep_provider_options is a semicolon-separated list of key => value pairs that configure the low-level Galera library.  These tweak the actual cluster communication and replication in the group communication system.  By contrast, other PXC global variables (like ‘wsrep%’) are set like other mysqld options and generally have more to do with MySQL/Galera …
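
To make the format concrete, here is a minimal Python sketch that splits such a string into a dictionary. It assumes the pairs are written as `key = value` and separated by semicolons; the helper name is hypothetical and the sample values are illustrative only, not recommendations.

```python
def parse_provider_options(raw):
    """Split a 'key = value; key = value' string into a dict of strings."""
    options = {}
    for item in raw.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected key=value, got: %r" % item)
        options[key.strip()] = value.strip()
    return options


# Illustrative sample string for this sketch.
sample = "gcache.size = 128M; evs.suspect_timeout = PT5S;"
parsed = parse_provider_options(sample)
```

Values are left as strings on purpose, since this sketch does not try to interpret sizes or durations.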

[Read more]
Measuring Max Replication Throughput on Percona XtraDB Cluster with wsrep_desync

Checking throughput with async MySQL replication

Replication throughput is the measure of just how fast the slaves can apply replication (at least by my definition).  In MySQL async replication this is important to know because the single-threaded apply nature of async replication can be a write performance bottleneck.  In a production system, we can tell how fast the slave is currently running (applying writes), and we might have historical data to check for the most throughput ever seen, but that doesn’t give us a solid way of determining where we stand right NOW().

An old consulting trick to answer this question is to simply stop replication on your slave for a minute (usually just the SQL_THREAD), restart it, and watch how long it takes to catch up.  We can also watch the slave thread apply rate during this interval to get a sense of just how many writes per second we can do and compare that with the normal rate …
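
The arithmetic behind that trick can be sketched in Python as follows. This is an estimate under two assumptions that are mine, not the post's: writes keep arriving at a constant normal rate throughout, and the slave applies at full speed for the whole catch-up period. The function name and sample numbers are hypothetical.

```python
def max_apply_rate(normal_rate, pause_seconds, catchup_seconds):
    """Estimate the slave's maximum apply rate in writes per second.

    While paused and while catching up, the master keeps producing writes
    at normal_rate, and the slave applies all of them during catch-up.
    """
    if catchup_seconds <= 0:
        raise ValueError("catchup_seconds must be positive")
    total_writes = normal_rate * (pause_seconds + catchup_seconds)
    return total_writes / catchup_seconds


# Hypothetical run: 1000 writes/s normally, paused 60 s, caught up in 30 s.
estimate = max_apply_rate(1000, 60, 30)
headroom = estimate / 1000  # how many times the normal rate the slave can sustain
```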

[Read more]
Changing an async slave of a PXC cluster to a new Master

Async and PXC

A common question I get about Percona XtraDB Cluster is whether you can mix it with asynchronous replication, and the answer is yes!  You can pick any node in your cluster and it can be either a slave or a master, just like any other regular MySQL standalone server (just be sure to use log-slave-updates in both cases on the node in question!).  Consider this architecture:

However, there are some caveats to be aware of.  If you slave from a cluster node, there is no built-in mechanism to fail that slave over automatically to another master node in your cluster.  You cannot assume that the binary log positions are the same on all nodes in your cluster (even if they start binary logging at the same time), so you can’t issue a CHANGE MASTER without knowing the proper binary log position to start at.

[Read more]
Is Synchronous Replication right for your app?

I talk with a lot of people who are really interested in Percona XtraDB Cluster (PXC), and mostly they are interested in PXC as a high-availability solution.  But what they tend not to think too much about is whether moving from async to synchronous replication is right for their application.

Facts about Galera replication

There are a lot of different facts about Galera that come into play here, and it isn’t always obvious how they will affect your database workload.  For example:

  • Transaction commit takes approximately the worst packet round trip time (RTT) between any two nodes in your cluster.
  • Transaction apply on slave nodes is still asynchronous from client commit (except on the original node where the transaction is committed)
  • Galera prevents writing conflicts to these pending transactions …
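
The first point in the list above has a direct consequence that is easy to put into numbers. The Python sketch below takes a table of round trip times and works out how many commits per second a single connection could issue back-to-back if each commit costs about the worst RTT. The node names, RTT values, and function names are hypothetical, and the sketch ignores every other cost of a commit.

```python
def worst_rtt_ms(rtt_ms):
    """Largest round trip time, in milliseconds, among all node pairs."""
    return max(rtt_ms.values())


def max_serial_commits_per_sec(rtt_ms):
    """Upper bound on back-to-back commits per second from one connection."""
    return 1000.0 / worst_rtt_ms(rtt_ms)


# Made-up cluster: two nodes in one data center, a third across a WAN.
rtt_ms = {
    ("node1", "node2"): 0.5,
    ("node1", "node3"): 100.0,
    ("node2", "node3"): 100.0,
}
per_connection_limit = max_serial_commits_per_sec(rtt_ms)
```

This bound applies to one connection committing serially; many connections committing in parallel are not limited to it in this simple model.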
[Read more]