Showing entries 1 to 10 of 31
10 Older Entries »
Displaying posts with tag: freesoftware (reset)
First steps with MariaDB Global Transaction ID

My previous writings were mostly teoretical, so I wanted to give a more practical example, showing the actual state of the current code. I also wanted to show how I have tried to make the feature fit well into the existing replication features, without requiring the user to enable lots of options or understand lots of restrictions before being able to use it.

So let us start! We will build the code from lp:~maria-captains/maria/10.0-mdev26, which at the time of writing is at revision

First, we start a master server on port 3310 and put a bit of data into it:

    server1> use test; …
[Read more]
More on global transaction ID in MariaDB

I got some very good comments/questions on my previous post on MariaDB global transaction ID, from Giuseppe and Robert (of Tungsten fame). I thought a follow-up post would be appropriate to answer and further elaborate on the comments, as the points they raise are very important and interesting.

(It also gives me the opportunity to explain more deeply a lot of interesting design decisions that I left out in the first post for the sake of brevity and clarity.)

On crash-safe slave

One of the things I really wanted to improve with global transaction ID is to make the replication slaves more crash safe with respect to their current replication state. This state is mostly persistently stored information about which event(s) were last executed on the slave, so that after a server restart the slave will know from which point in the master binlog(s) …

[Read more]
Global transaction ID in MariaDB

The main goal of global transaction ID is to make it easy to promote a new master and switch all slaves over to continue replication from the new master. This is currently harder than it could be, since the current replication position for a slave is specified in coordinates that are specific to the current master, and it is not trivial to translate them into the corresponding coordinates on the new master. Global transaction ID solves this by annotating each event with the global transaction id which is unique and universal across the whole replication hierarchy.

In addition, there are at least two other main goals for MariaDB global transaction ID:

  1. Make it easy to setup global transaction ID, and easy to provision a new slave into an existing replication hierarchy.
  2. Fully support …
[Read more]
Integer overflow

What do you think of this piece of C code?

  void foo(long v) {
    unsigned long u;
    unsigned sign;
    if (v < 0) {
      u = -v;
      sign = 1;
    } else {
      u = v;
      sign = 0;

Seems pretty simple, right? Then what do you think of this output from MySQL:

  mysql> create table t1 (a bigint) as select '-9223372036854775807.5' as a;
  mysql> select * from t1;
  | a                    |
  | -'..--).0-*(+,))+(0( | 

Yes, that is authentic output from older versions of MySQL. Not just the wrong number, the output is complete garbage! This is my all-time favorite MySQL bug#31799. It was caused by code like the above C snippet.

So can you spot what is wrong with the code? Looks pretty simple, does it not? But the title of this post …

[Read more]
Even faster group commit!

I found time to continue my previous work on group commit for the binary log in MariaDB.

In current code, a (group) commit to InnoDB does not less than three fsync() calls:

  1. Once during InnoDB prepare, to make sure we can recover the transaction in InnoDB if we crash after writing it to the binlog.
  2. Once after binlog write, to make sure we have the transaction in the binlog before we irrevocably commit it in InnoDB.
  3. Once during InnoDB commit, to make sure we no longer need to scan …
[Read more]
Tale of a bug

This is a tale of the bug lp:798213. The bug report has the initial report, and a summary of the real problem obtained after detailed analysis, but it does not describe the processes of getting from the former to the latter. I thought it would be interesting to document this, as the analysis of this bug was rather tricky and contains several good lessons.


The bug first manifested itself as a sporadic failure in one of our random query generator tests for replication. We run this test after all MariaDB pushes in our Buildbot setup. However, this failure had only occured twice in several months, so it is clearly a very rare failure.

The first task was to try to repeat the problem and get some more data in the form of binlog files and so on. Philip kindly helped with this, and …

[Read more]
The future of replication revealed in Istanbul

A very good meeting in Istanbul is drawing to an end. People from Monty Program, Facebook, Galera, Percona, SkySQL, and other parts of the community are meeting with one foot on the European continent and another in Asia to discuss all things MariaDB and MySQL and experience the mystery of the Orient.

At the meeting I had the opportunity to present my plans and visions for the future development of replication in MariaDB. My talk was very well received, and I had a lot of good discussions afterwards with many of the bright people here. Working from home in a virtual company, it means a lot to get this kind of inspiration and encouragement from others on occasion, and I am looking forward to continuing the work after an early flight to Copenhagen tomorrow.

The new interface for transaction coordinator plugins is what particularly interests me at the moment. The …

[Read more]
Dynamic linking costs two cycles

It turns out that the overhead of dynamic linking on Linux amd64 is 2 CPU cycles per cross-module call. I usually take forever to get to the point in my writing, so I thought I would change this for once :-)

In MySQL, there has been a historical tendency to favour static linking, in part because to avoid the overhead (in execution efficiency) associated with dynamic linking. However, on modern systems there are also very serious drawbacks when using static linking.

The particular issue that inspired this article is that I was working on MWL#74, building a proper shared library for the MariaDB embedded server. The lack of a proper in MySQL and MariaDB has caused no end of grief for packaging Amarok for the various Linux distributions. My patch …

[Read more]
Micro-benchmarking pthread_cond_broadcast()

In my work on group commit for MariaDB, I have the following situation:

A group of threads are going to participate in group commit. This means that one of the threads, called the group leader, will run an fsync() for all of them, while the other threads wait. Once the group leader is done, it needs to wake up all of the other threads.

The obvious way to do this is to have the group leader call pthread_cond_broadcast() on a condition that the other threads are waiting for with pthread_cond_wait():

  bool wakeup= false;
  pthread_cond_t wakeup_cond;
  pthread_mutex_t wakeup_mutex


  while (!wakeup)
    pthread_cond_wait(&wakeup_cond, &wakeup_mutex);
  // Continue processing after group commit …
[Read more]
MySQL/MariaDB replication: applying events on the slave side

Working on a new set of replication APIs in MariaDB, I have given some thought to the generation of replication events on the master server.

But there is another side of the equation: to apply the generated events on a slave server. This is something that most replication setups will need (unless they replicate to non-MySQL/MariaDB slaves). So it will be good to provide a generic interface for this, otherwise every binlog-like plugin implementation will have to re-invent this themselves.

A central idea in the current design for generating events is that we do not enforce a specific content of events. Instead, the API provides accessors for a lot of different information related to each event, allowing the plugin flexibility in choosing what to include in a …

[Read more]
Showing entries 1 to 10 of 31
10 Older Entries »