Showing entries 11 to 20 of 28
« 10 Newer Entries | 8 Older Entries »
Displaying posts with tag: C/C++
Drizzle Replication - Changes in API to support Group Commit

Hi all. It's been quite some time since my last article on the new replication system in Drizzle. My apologies for the delay in publishing the next article in the replication series.

The delay has been due to a reworking of the replication system to fully support "group commit" behaviour and to support fully transactional replication. The changes allow replicator and applier plugins to understand much more about the actual changes which occurred on the server, and to understand the transactional container properly.

The goals of Drizzle's replication system are as follows:

  • Make replication modular and not dependent on one particular implementation
[Read more]
Yet Another Post on REPLACE

Sometimes, as Sergei rightly mentioned, I can be, well, "righteously indignant" about what I perceive to be a hack.

In this case, after Sergei repeatedly tried to set me straight about what was going on "under the covers" during a REPLACE operation, I was still arguing that he was incorrect.

Doh.

I then realized that Sarah Sproehnle's original comment about my test table not having a primary key explained the behaviour I had been seeing.

My original test case was failing, expecting to see a DELETE + an INSERT, when a REPLACE INTO was issued against a table. When I placed the PRIMARY KEY on the table in my test case and re-ran the test case, it still …

[Read more]
The Deal with REPLACE .. Or Is It UPDATE?

Yesterday, I posed a question to the ZanyWeb about what exactly a REPLACE statement does behind the scenes in the storage engine. There were many excellent comments and these comments exposed some misunderstandings (including some of my own misconceptions) about the REPLACE statement itself and what goes on behind the scenes in the storage engine.

The question I asked was this: if I execute the following statements in a client, what would you expect would happen behind the scenes in the storage engine?

CREATE TABLE t1 (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
, padding VARCHAR(200) NOT NULL
);

INSERT INTO t1 VALUES (1, "I love testing.");
INSERT INTO t1 VALUES (2, "I hate testing.");

REPLACE INTO t1 VALUE (2, "I love testing.");

Based purely on the manual, one would expect, as …

[Read more]
Drizzle Replication - The CommandReplicator and CommandApplier Plugin API

IMPORTANT:
This article is out of date and the replication API has been updated. Please see the follow-up article for the most up to date information!

OK, so here is the next installment in the Drizzle replication article series. Today, I'll be talking about the flow of the Command message object through the CommandReplicator and CommandApplier APIs. If you missed the first article about the structure of the Command message and Google Protobuffers, you may want to read that first. We'll only be talking in this article about what happens on one server. We will be discussing the Command Log in the next article, and then discuss how messages are passed from one …

[Read more]
Drizzle Replication - The Command Message

IMPORTANT:
This article is out of date and the replication API has been updated. Please see the follow-up article for the most up to date information!

I wanted to start writing about how Drizzle's new replication system works, how its internals are structured, how logs are formatted, what its (current) limitations are, what features are planned, how you can get involved in development, and a lot more. Before jumping in, you may want to read a quick overview about the concepts of Drizzle replication here.

Fortunately, some advice from my friend Edwin DeSouza got me back to reality: "Jay, do a series of small, targeted, easily digestible blog posts". And, so, this …

[Read more]
Towards a New Modular Replication Architecture

Over the past week, I've been refactoring the way that the Drizzle kernel communicates with plugin modules that wish to implement functionality related to replication. There are many, many potholes in the current way that row-based replication works in Drizzle, and my refactoring efforts were solely focused on three things:

  • Make an interface for replicating actions which occur inside a server that is clear and simple to understand for the caller of the interface.
  • Make an interface that uses only documented data structures and standardized containers.
  • Completely remove the notion that logging is tightly-coupled with replication.
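To make the three points above a little more concrete, here is a rough sketch of what such a decoupled design could look like. Everything in it is hypothetical and for illustration only: the names `ChangeRecord`, `Applier`, `Replicator`, `InMemoryLogApplier` and `CountingApplier` are assumptions made up for this sketch and are not Drizzle's actual plugin API. The idea shown is simply that the caller hands a plain, documented structure to one small interface, and that a log is just one possible consumer among others.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a replicated change. It uses only standard
// types, so a plugin author needs no knowledge of server internals.
struct ChangeRecord
{
  std::uint64_t transaction_id;
  std::string table_name;
  std::string payload;
};

// Anything that consumes changes implements this one method.
class Applier
{
public:
  virtual ~Applier() {}
  virtual void apply(const ChangeRecord &change) = 0;
};

// The only thing the caller talks to. It knows nothing about logging;
// it just forwards each change to whatever appliers are attached.
class Replicator
{
public:
  void attach(Applier *applier)
  {
    assert(applier != nullptr);
    appliers_.push_back(applier);
  }

  void replicate(const ChangeRecord &change)
  {
    for (Applier *applier : appliers_)
      applier->apply(change);
  }

private:
  std::vector<Applier *> appliers_;
};

// Logging is just one applier among many (here, an in-memory list).
class InMemoryLogApplier : public Applier
{
public:
  void apply(const ChangeRecord &change) override { entries_.push_back(change); }
  const std::vector<ChangeRecord> &entries() const { return entries_; }

private:
  std::vector<ChangeRecord> entries_;
};

// A second, unrelated consumer that only counts the changes it sees.
class CountingApplier : public Applier
{
public:
  void apply(const ChangeRecord &) override { ++count_; }
  std::size_t count() const { return count_; }

private:
  std::size_t count_ = 0;
};
```

In this sketch, removing the log applier does not affect the counting applier or the caller, which is the kind of loose coupling the third bullet is after.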

Let me expand on these goals, and why they are critical to the success of a replication architecture.

Simple, Clear Interfaces Designed for the Interface Caller …

[Read more]
Libdrizzle Benchmarks - Massive Performance Increases

Last night and today, I ran a series of benchmarks against Drizzle. These benchmarks were designed to isolate the performance improvement or regression from one change: using Eric Day's new libdrizzle client library instead of the legacy libdrizzleclient library from MySQL. The results are in, and they are stunning.

Here is a graph comparing Drizzle sysbench results on a read-only workload, where the only difference is whether sysbench uses the libdrizzle driver or the libdrizzleclient (libmysql) driver:

As you can see, with libdrizzle, the throughput is dramatically increased, with Drizzle scaling to 8x the number of cores on the benchmark machine before a …

[Read more]
Small but steady progress in improving Drizzle performance

We're making steady progress in removing bottlenecks in the Drizzle code base. So far, a number of mutexes have been removed and we've begun to replace a number of contention points with atomic instructions, which remove the need for a lock structure on platforms that support atomic fetch and store instructions.
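For readers who haven't seen this kind of change before, here is a minimal sketch of the general technique. It is not taken from the Drizzle code base: the function names are made up for illustration, and it assumes a compiler with C++11 `std::atomic` and `std::thread`, which may well differ from the atomic primitives Drizzle itself uses. Both functions count increments from several threads; the first guards the counter with a mutex, the second replaces the lock with a single atomic fetch-and-add.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Lock-based version: every increment takes and releases a mutex,
// so all threads contend on the same lock.
std::uint64_t count_with_mutex(unsigned num_threads, unsigned increments_per_thread)
{
  assert(num_threads > 0);
  std::uint64_t counter = 0;
  std::mutex counter_lock;
  std::vector<std::thread> workers;

  for (unsigned i = 0; i < num_threads; ++i)
  {
    workers.emplace_back([&counter, &counter_lock, increments_per_thread]() {
      for (unsigned n = 0; n < increments_per_thread; ++n)
      {
        std::lock_guard<std::mutex> guard(counter_lock);
        ++counter;
      }
    });
  }
  for (std::thread &worker : workers)
    worker.join();
  return counter;
}

// Atomic version: the increment is one atomic read-modify-write
// operation, so no separate lock structure is needed.
std::uint64_t count_with_atomic(unsigned num_threads, unsigned increments_per_thread)
{
  assert(num_threads > 0);
  std::atomic<std::uint64_t> counter(0);
  std::vector<std::thread> workers;

  for (unsigned i = 0; i < num_threads; ++i)
  {
    workers.emplace_back([&counter, increments_per_thread]() {
      for (unsigned n = 0; n < increments_per_thread; ++n)
        counter.fetch_add(1, std::memory_order_relaxed);
    });
  }
  // join() makes every worker's increments visible to this thread.
  for (std::thread &worker : workers)
    worker.join();
  return counter.load();
}
```

Both versions produce the same total; the difference is only in how much the threads get in each other's way while producing it.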

I'm pretty positive about the direction we are going so far. We're seeing the right trends in our scaling graphs, with very little performance drop off in read-only workloads up to 4X the number of cores on the machine, and little performance drop off on the read-write workloads up to 2X the number of cores, as you can see from the graphs below.

It's a little difficult to see, but we've made a small but steady improvement from r950 to r968, with numbers increasing around 1-2% across most concurrency levels. You can see the raw numbers here:

+--------------------------------+-------+-----+---------+----------+
| …
[Read more]
LCOV Code Coverage Pages for Drizzle

Yesterday, Monty and I were fussing around with lcov and genhtml trying to generate code coverage analysis for Drizzle. After a few hours, I was finally able to get some good output, and I've published the results temporarily on my website.

We're currently at 70.4% code coverage, which is less than ideal, but at least we now have a baseline from which to improve. We're all about making incremental improvements, and having statistics to tell us whether we're going in the right direction is important. This is a good first step.

So, what exactly do these code coverage numbers mean?

OK, so for those readers not familiar with gcov or lcov, here is what these code coverage numbers actually mean... They represent the percentage of executable source lines which are executed during a run of Drizzle's test suite. Basically, the percent gives us a rough idea of the percent of …

[Read more]
A Better Parser Needed?

Taking a little break from refactoring temporal data handling this evening, I decided to run some profiles against both Drizzle and MySQL 5.1.33. I profiled the two servers with callgrind (a valgrind tool/skin) while running the drizzleslap/mysqlslap test case. In both cases, I had to make a small change to the drizzled/tests/test-run.pl Perl script.

For the MySQL build, I used the BUILD/compile-amd64-debug-max build script. For Drizzle, I used my standard build process which builds Drizzle with maximum debugging symbols and hooks. It's worth noting that the debug and build processes for MySQL and Drizzle are very different, and the MySQL debug build contains hooks to the DBUG library, which you'll notice appear on the MySQL call graphs and consume a lot of the overall function calls. You won't see this in the Drizzle graphs because we do not use DBUG. For all intents and purposes, just ignore the calls to anything in the DBUG library …

[Read more]