Showing entries 1 to 10 of 19
9 Older Entries »
Displaying posts with tag: parallel (reset)
FOSDEM 2020

My post-FOSDEM detox has started - despite preparing by reading some survival guides, I hadn't really fathomed the variety and quantity (and quality) of beer that would flow over four days.  On reflection however, the beer flow has been far exceeded by the flow of tech content and conversation.

On Thursday and Friday I attended the pre-FOSDEM MySQL Days fringe event, where there were two tracks of talks and tutorials on MySQL including sessions on :
 - MySQL Server simplification
 - MySQL replication tooling improvements
 - Configuring group replication
 - Troubleshooting group replication
 - Using DNS for loadbalancing and failover
 - Upgrading to MySQL 8.0 …

[Read more]
Concurrent sandbox deployment


Version 0.3.0 of dbdeployer has gained the ability of deploying multiple sandboxes concurrently. Whenever we deploy a group of sandboxes (replication, multiple) we can use the --concurrent flag, telling dbdeployer that it should run operations concurrently.

What happens when a single sandbox gets deployed? There are six sets of operations:

  1. Create the sandbox directory and write down its scripts;
  2. Run the initialisation script;
  3. Start the database server;
  4. Run the pre-grants SQL commands (if any;)
  5. Load the grants;
  6. Run the post-grants SQL commands (if any;)

When several …

[Read more]
MySQL replication in action - Part 5 - parallel appliers

Previous episodes:

MySQL replication in action - Part 1: GTID & CoMySQL replication in action - Part 2 - Fan-in topologyMySQL replication in action - Part 3 - All-masters P2P topologyMySQL replication in action - Part 4 - star and hybrid topologies
Parallel replication overviewOne …

[Read more]
Parallel replication: off by one

One of the most common errors in development is where a loop or a retrieval by index falls short or long by one unit, usually because of an oversight or a logic in coding.

Of the following snippets, which one will run 10 times?

/* #1 */    for (N = 0 ; N < 10; N++) printf("%d\n", N);

/* #2 */ for (N = 0 ; N <= 10; N++) printf("%d\n", N);

/* #3 */ for (N = 1 ; N <= 10; N++) printf("%d\n", N);

/* #4 */ for (N = 1 ; N < 10; N++) printf("%d\n", N);

The question is deceptive, as there are two snippets that will run 10 times (1 and 3). But they will print different numbers. If you ware aiming for numbers from 1 to 10, only #3 is good.

After many years of programming, off-by-one errors are rare in my code, and I have been able to spot them or prevent them at first sight. That’s why I feel uneasy when I look at the way parallel replication is enabled in …

[Read more]
One billion

As always, I am a little late, but I want to jump on the bandwagon and mention the recent MySQL Cluster milestone of passing 1 billion queries per minute. Apart from echoing the arbitrarily large ransom demand of Dr Evil, what does this mean?

Obviously 1 billion is only of interest to us humans as we generally happen to have 10 fingers, and seem to name multiples in steps of 10^3 for some reason. Each processor involved in this benchmark is clocked at several billion cycles per second, so a single billion is not so vast or fast.

Measuring over a minute also feels unnatural for a computer performance benchmark - we are used to lots of things happening every second! A minute is a long time in silicon.

What's more, these …

[Read more]
Eventual Consistency in MySQL Cluster - using epochs




Before getting to the details of how eventual consistency is implemented, we need to look at epochs. Ndb Cluster maintains an internal distributed logical clock known as the epoch, represented as a 64 bit number. This epoch serves a number of internal functions, and is atomically advanced across all data nodes.

Epochs and consistent distributed state

Ndb is a parallel database, with multiple internal transaction coordinator components starting, executing and committing transactions against rows stored in different data nodes. Concurrent transactions only interact where they attempt to lock the same row. This design minimises unnecessary system-wide synchronisation, enabling linear scalability of reads and writes.

The stream of changes made to rows stored at a …

[Read more]
Some MySQL projects I think are cool - Shard-Query

I've already described Justin Swanhart's Flexviews project as something I think is cool. Since then Justin appears to have been working more on Shard-Query which I also think is cool, perhaps even more so than Flexviews.

On the page linked above, Shard-Query is described using the following statements :

"Shard-Query is a distributed parallel query engine for MySQL"
"ShardQuery is a PHP class which is intended to make working with a partitioned dataset easier" "ParallelPipelining - MPP distributed query engines runs fragments of queries in parallel, combining the results at the end. Like map/reduce except it speaks SQL directly."

The things I like from the above description :

  • Distributed …
[Read more]
Some MySQL projects I think are cool - HandlerSocket Plugin

The HandlerSocket project is described in Yoshinori Matsunobu's blog entry under the title 'Using MySQL as a NoSQL - A story for exceeding 750,000 qps on a commodity server'. It's a great headline and has generated a lot of buzz. Quite a few early commentators were a little confused about what it was - a new NoSQL system using InnoDB? A cache? In memory only? Where does Memcached come in? Does it support the Memcached protocol? If not, why not? Why is it called HandlerSocket?

Inspirations from Memcache may include the focus on simplicity, performance and a simple human readable protocol. As Yoshinori says, Kazuho Oku has already implemented a MySQLD-embedded Memcached server, no need to do it again. What's more, the Memcache protocol …

[Read more]
Four short links: 7 June 2011
  1. OMG Text -- a plugin for CSS framework Compass for directional text shadows. (via David Kaneda)
  2. Build a Cheap Bitcoin Mine -- some day it will be revealed that the act of generating a bitcoin token is helping the Russian mafia to crack nuclear missile launch codes and Afghan druglords built the Bitcoin system to destabilize the US dollar.
  3. Polycode -- a free, open-source, cross-platform framework for creative code. You can use it as a C++ API or as a standalone scripting language to get easy and simple access to accelerated 2D and 3D graphics, hardware shaders, sound and network …
[Read more]
Memory tuning fast paced ETL

Dear Kettle friends,

on occasion we need to support environments where not only a lot of data needs to be processed but also in frequent batches.  For example, a new data file with hundreds of thousands of rows arrives in a folder every few seconds.

In this setting we want to use clustering to use “commodity” computing resources in parallel.  In this blog post I’ll detail how the general architecture would look like and how to tune memory usage in this environment.

Clustering was first created around the end of 2006.  Back then it looked like this.

The master

This is the most important part of our cluster.  It takes care of administrating network configuration and topology.  It also keeps track of the state of dynamically added slave servers.

The master …

[Read more]
Showing entries 1 to 10 of 19
9 Older Entries »