There was an idea. An idea to make Vitess self-reliant. An idea to get rid of the friction between Vitess and external fault-detection-and-repair tools. An idea that gave birth to VTOrc… Both VTOrc and Orchestrator are tools for managing MySQL instances. If I were to describe these tools using a metaphor, I would say that they are kinda like the monitor of a class of students. They are responsible for keeping the MySQL instances in check and fixing them up in case they misbehave, just like how a monitor ensures that no mischief happens in the classroom.
An Aurora cluster promises a high-availability solution and a seamless failover procedure. But how much downtime actually occurs when a failover happens, and how can ProxySQL help minimize it? A sneak peek at the results: ProxySQL achieves up to 25x less downtime and up to ~9800x fewer errors during unplanned failovers. How ProxySQL achieves this:
- Less downtime
- “Queueing” feature when an instance in a hostgroup becomes unavailable.
So what is ProxySQL? ProxySQL is a middle layer between the application and the database. It protects databases from high traffic spikes, prevents them from accumulating a high number of connections thanks to its multiplexing feature, and minimizes the impact of planned or unexpected failovers and crashes.
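The queueing behavior mentioned above is configured through the ProxySQL admin interface. As a minimal sketch (the hostname and timeout value are illustrative), the relevant knob is `mysql-connect_timeout_server_max`, which bounds how long ProxySQL holds queries in its queue while no server in the hostgroup is usable:

```sql
-- Connect to the ProxySQL admin interface (port 6032 by default)
-- and register a backend in a hostgroup:
INSERT INTO mysql_servers (hostgroup_id, hostname, port)
VALUES (0, 'aurora-writer.example.com', 3306);  -- illustrative hostname

-- Queries routed to the hostgroup are queued (not failed) for up to
-- this many milliseconds while no server there is available:
UPDATE global_variables
SET variable_value = '20000'
WHERE variable_name = 'mysql-connect_timeout_server_max';

-- Activate and persist the changes:
LOAD MYSQL SERVERS TO RUNTIME;
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
SAVE MYSQL VARIABLES TO DISK;
```

During a failover, clients keep their connections to ProxySQL; only the queries wait in the queue until a server in the hostgroup becomes usable again, which is what drives the reduced error counts.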
This blog will continue with measuring the impact of an unexpected …[Read more]
Since MySQL 8.0.22 there is a mechanism in asynchronous replication that makes the receiver automatically try to re-establish an asynchronous replication connection to another sender, in case the current connection gets interrupted due to the failure of the current sender.
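A minimal sketch of enabling this feature (host names, channel name, and weight are illustrative; the `CHANGE REPLICATION SOURCE TO` spelling is for 8.0.23+, while 8.0.22 uses `CHANGE MASTER TO` with the same option):

```sql
-- Register an alternate sender in the channel's failover list;
-- arguments are channel, host, port, network namespace, and weight.
SELECT asynchronous_connection_failover_add_source(
    'ch1', 'source2.example.com', 3306, '', 50);

-- Turn on automatic connection failover for the channel
-- (requires GTID auto-positioning):
CHANGE REPLICATION SOURCE TO
    SOURCE_CONNECTION_AUTO_FAILOVER = 1,
    SOURCE_AUTO_POSITION = 1
    FOR CHANNEL 'ch1';
```

With this in place, if the current sender fails, the receiver retries the remaining sources in the list by weight instead of waiting for manual intervention.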
The post Automatic connection failover for Asynchronous Replication first appeared on dasini.net - Diary of a MySQL expert.
Geo-scale MySQL – or how to build a global, multi-region MySQL cloud back-end capable of serving several hundred million player accounts
This post introduces a series of blogs we’ll be publishing over the next few months. They discuss a number of customer use cases that our solutions support, all centred around achieving continuous MySQL operations with commercial-grade high availability (HA), geographically redundant disaster recovery (DR), and global scaling.
This first use case looks at a customer of ours, a global gaming company with several hundred million worldwide player accounts.
What is the challenge?
How do you reliably, and quickly, cater to hundreds of millions of game players around the world? The challenge is to serve a game application to a geographically distributed audience; in other words, a pretty unique challenge.
It requires fast, local response times …[Read more]
The Question
Recently, a customer asked us:
How would we manually move the relay role from a failing node to a slave in a Composite Tungsten Cluster passive site?
The Answer
The Long and the Short of It
There are two ways to handle this procedure manually when the switch command fails to work as expected. One is short and reasonably automated, and the other is much more detailed and manual.
Of course, the usual procedure is to just issue the switch command in the passive cluster:
use west
set policy maintenance
switch
set policy automatic
The article below describes what to do when the switch command does not move the relay role as expected.
Below is the list of cctrl commands that would be run for the basic, short version, which (aside from handling policy changes) is really only …[Read more]
We’re pleased to share our webinar “Multi-Region AWS Aurora vs Continuent Tungsten for MySQL & MariaDB”, recorded live on Thursday, April 18th, 2019.
Our colleague Matt Lang walks you through a comparison of building a global, multi-region MySQL / MariaDB / Percona cloud back-end using AWS Aurora versus Continuent Tungsten.
If you’d like to find out how multi-region AWS Aurora deployments can be improved – then this webinar is for you!
We hope you enjoy it!
Thursday, April 18th at 10am PST / 1pm EST / 4pm BST / 5pm CEST
Recording: follow this link to watch
Slides: …[Read more]
In case you missed the Multimaster webinar recorded live on Thursday, March 28th, 2019:
Learn how NewVoiceMedia built a global, multi-region MySQL cloud back-end to support a high-volume cloud contact center.
Find out how to deploy Multimaster MySQL / MariaDB / Percona with the following design criteria:
- Geographically distributed, low-latency data
- Fast local response times for read & write traffic
- Full ACID compliance – atomic operations, guaranteed consistency, isolation, and durability
- Local rapid-failover, automated high availability
Director of Professional Services – EMEA/APAC, is based in the UK, and has over 20 …[Read more]
For those of you who missed the webinar, “Geo-Scale MySQL in AWS,” recorded live on Thursday, March 14th, 2019:
Learn how to build a global, multi-region MySQL / MariaDB / Percona cloud back-end capable of serving hundreds of millions of online multiplayer game accounts.
Find out how Riot Games serves a globally distributed audience with low-latency, fast response times for read traffic, rapid-failover automated high availability, simple administration, system visibility, and stability.
Eric M. Stone
COO at Continuent, is a veteran of fast-paced, large-scale enterprise environments with 35 years of Information Technology experience. With a focus on HA/DR, from building data centers and trading floors to world-wide deployments, Eric has …[Read more]
In the Amazon space, any EC2 or service instance can “disappear” at any time. Depending on which service is affected, it will be restarted automatically. In EC2 you can choose whether an interrupted instance is restarted or left shut down.
An interrupted Aurora instance, however, is always restarted. Makes sense.
The restart timing, and other consequences during the process, are noted in our post on Aurora Failovers.
Aurora Testing Limitations
As mentioned earlier, we love testing “uncontrolled” failovers. That is, we want to be able to pull any plug on any service, and see that the environment as a whole continues to do its job. We can’t do that with Aurora, because we can’t control the essentials:
- power button;
- reset switch; …
Right now Aurora only allows a single master, with up to 15 read-only replicas.
We love testing failure scenarios, however our options for such tests with Aurora are limited (we might get back to that later). Anyhow, we told the system, through the RDS Aurora dashboard, to do a failover. These were our observations:
Role Change Method
Both master and replica instances are actually restarted (the MySQL uptime resets to 0).
This is quite unusual these days: we can do a fully controlled role change in classic asynchronous replication without a restart (CHANGE MASTER TO …), and Galera doesn’t have read/write roles as such (all instances are technically writers), so it doesn’t need role changes at all.
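For contrast, a restart-free role change in classic asynchronous replication can be sketched roughly as follows (hostnames and credentials are illustrative, and the exact steps depend on the topology and on GTIDs being enabled):

```sql
-- On the replica being promoted: stop replicating and clear
-- its replication configuration (it becomes the new master).
STOP SLAVE;
RESET SLAVE ALL;

-- On the demoted master: point it at the new master and resume.
-- Neither server's mysqld process is restarted at any point.
CHANGE MASTER TO
    MASTER_HOST = 'new-master.example.com',  -- illustrative hostname
    MASTER_USER = 'repl',
    MASTER_PASSWORD = '<secret>',
    MASTER_AUTO_POSITION = 1;                -- GTID-based positioning
START SLAVE;
```

Because only replication threads stop and start, MySQL uptime is preserved and connections to each server can survive the role change, which is exactly what the Aurora restart-based method gives up.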
Failover between running instances takes about 30 seconds. This is in line with information provided in the …[Read more]