I think MariaDB has had a great few weeks recently and the timeline of these events is important.
|Showing entries 1 to 28|
I’ll be giving an online webinar for Drizzle contributors on Saturday, May 15th @ 1am GMT (in the U.S. this is Friday, May 14th @ 9pm EDT, 6pm PDT).
Note that the DimDim widget below shows the time as May 14th @ 8pm. The widget is wrong, since DimDim does not account for daylight saving time.
Space is strictly limited to 20 people and this will be done via DimDim.com. Please register for the webinar by entering your email address in the widget below and clicking “Sign Up”.
The agenda for this 2-3 hour tutorial will be:
So, last year, Drizzle participated in the Google Summer of Code under the MySQL project organization. We had four excellent student submissions, and Monty Taylor, Eric Day, Stewart Smith and I all mentored students for the summer. It was my second year mentoring, and I really enjoyed it, so I was looking forward to this year’s summer of code.
Padraig O’Sullivan, a GSoC student last year, now works at Akiban Technologies, partly on Drizzle. This year he is the GSoC Administrator and also a mentor for Drizzle, and Drizzle is its own[Read more...]
Today I pushed up the initial patch which adds XA support to Drizzle’s transaction log. So, to give myself a bit of a rest from coding, I’m going to blog a bit about the transaction log and show off some of its features. WARNING: Please keep in mind that the transaction log module in Drizzle is under heavy development and should not be used in production environments. That said, I’d love to get as much feedback as possible on it, and if you feel like throwing some heavy data at it, that would be awesome.
Simply put, the transaction log is a record of every modification to the state of the server’s data. It is similar to[Read more...]
Over the past six weeks or so, I have been working on cleaning up the pluggable storage engine API in Drizzle. I’d like to describe some of this work and talk a bit about the next steps I’m taking in the coming months as we roll towards implementing Log Shipping in Drizzle.
First, how did it come about that I started working on the storage engine API?
Well, it really goes back to my work on Drizzle’s replication system. I had implemented a simple, fast, and extensible log which stored records of the data changes made to a server. Originally, the log was called the Command Log, because the Google Protobuffer messages it contained were called[Read more...]
Although a few folks knew about where I and many of the Sun Drizzle team had ended up, we’ve waited until today to “officially” tell folks what’s up. We — Monty Taylor, Eric Day, Stewart Smith, Lee Bieber, and myself — are all now “Rackers”, working at Rackspace Cloud. And yep, we’re still workin’ on Drizzle. That’s the short story. Read on for the longer one
I left my previous position of Community Relations Manager at MySQL to begin working on Brian Aker’s newfangled Drizzle project in October[Read more...]
I've been coding up a storm in the last couple days and have just about completed coding on three new INFORMATION_SCHEMA views which allow anyone to query the new Drizzle transaction log for information about its contents. I've also finished a new UDF for Drizzle called PRINT_TRANSACTION_MESSAGE() that prints out the Transaction message's contents in an easy-to-read format.
I don't have time for a full walk-through blog entry about it, so I'll just paste some output below and let y'all take a looksie. A later blog entry will feature lots of source code explaining how you, too, can easily add INFORMATION_SCHEMA views to your Drizzle plugins.
Below are the results of the following sequence of actions:[Read more...]
This week, I am working on putting together test cases which validate the Drizzle transaction log’s handling of BLOB columns.
I ran into an interesting set of problems and am wondering how to go about handling them. Perhaps the LazyWeb will have some solutions.
The problem, in short, is inconsistency in the way that the NUL character is escaped (or not escaped) in both the MySQL/Drizzle protocol and the MySQL/Drizzle client tools. And, by client tools, I mean both everyone’s favourite little mysql command-line client and the mysqltest client, which provides infrastructure and runtime services for the MySQL and Drizzle test suites.
Even within the server and client protocol,[Read more...]
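To make the NUL problem concrete, here is a minimal Python sketch of what a consistent escape/unescape pair looks like for a BLOB value. It is not the code used by the mysql client, mysqltest, or the Drizzle protocol. The function names are hypothetical and only two escape sequences are handled: `\0` for NUL (the sequence MySQL string literals use) and `\\` for the backslash itself.

```python
def escape_blob(data: bytes) -> bytes:
    """Escape backslashes first, then NUL bytes, so the result is NUL-free."""
    return data.replace(b"\\", b"\\\\").replace(b"\x00", b"\\0")


def unescape_blob(text: bytes) -> bytes:
    """Reverse escape_blob(): turn \\0 back into NUL and \\\\ back into one backslash."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i:i + 1] == b"\\" and i + 1 < len(text):
            nxt = text[i + 1:i + 2]
            # "\0" maps to the NUL byte; any other escaped byte is taken literally.
            out += b"\x00" if nxt == b"0" else nxt
            i += 2
        else:
            out += text[i:i + 1]
            i += 1
    return bytes(out)


raw = b"ab\x00cd\\e"            # hypothetical BLOB value with a NUL and a backslash
escaped = escape_blob(raw)
print(escaped)                   # safe to pass through text-oriented tools
print(unescape_blob(escaped) == raw)
```

A test case is only reliable if every tool in the chain agrees on one such pair. If one side escapes and the other does not, the BLOB read back will not match the BLOB written.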
In this installment of my Drizzle Replication blog series, I'll be talking about the Transaction Log. Before reading this entry, you may want to first read up on the Transaction Message, which is a central concept to this blog entry.
The transaction log is just one component of Drizzle's default replication services, but it also serves as a generalized log of atomic data changes to a particular server. In this way, it is only partially related to replication. The transaction log is used by components of the replication services to store changes made to a server's data. However, there is nothing that mandates that this particular transaction log be a required feature for Drizzle replication systems. For instance,[Read more...]
The delay has been due to a reworking of the replication system to fully support "group commit" behaviour and to support fully transactional replication. The changes allow replicator and applier plugins to understand much more about the actual changes which occurred on the server, and to understand the transactional container properly.
The goals of Drizzle's replication system are as follows:
Sometimes, as Sergei rightly mentioned, I can be, well, "righteously indignant" about what I perceive to be a hack.
In this case, after Sergei repeatedly tried to set me straight about what was going on "under the covers" during a REPLACE operation, I was still arguing that he was incorrect.
I then realized that Sarah Sproehnle's original comment about my test table not having a primary key was the reason that I was seeing the behaviour that I had been seeing.
My original test case was failing, expecting to see a DELETE + an INSERT, when a REPLACE INTO was issued against a table. When I placed the PRIMARY KEY on the table in my[Read more...]
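The visible effect of the missing primary key can be reproduced in miniature. The sketch below uses Python's built-in sqlite3 module, not MySQL or Drizzle, so it says nothing about what either storage engine does internally. It only shows that REPLACE has nothing to replace when the table has no key to conflict on. The helper name and the replacement string 'replaced' are hypothetical.

```python
import sqlite3


def rows_after_replace(with_primary_key: bool):
    """Run two INSERTs and one REPLACE on an in-memory SQLite table; return its rows."""
    conn = sqlite3.connect(":memory:")
    pk = " PRIMARY KEY" if with_primary_key else ""
    conn.execute(
        f"CREATE TABLE t1 (id INTEGER NOT NULL{pk}, padding TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO t1 VALUES (1, 'I love testing.')")
    conn.execute("INSERT INTO t1 VALUES (2, 'I hate testing.')")
    # With a key on id, this conflicts with the existing row 2 and replaces it.
    # Without a key there is no conflict, so it behaves like a plain INSERT.
    conn.execute("REPLACE INTO t1 VALUES (2, 'replaced')")
    rows = conn.execute("SELECT id, padding FROM t1 ORDER BY rowid").fetchall()
    conn.close()
    return rows


print(rows_after_replace(with_primary_key=True))
print(rows_after_replace(with_primary_key=False))
```

With the primary key the table ends up with two rows. Without it the table ends up with three, including two rows whose id is 2, because there was no existing row for REPLACE to remove.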
Yesterday, I posed a question to the ZanyWeb about what exactly a REPLACE statement does behind the scenes in the storage engine. There were many excellent comments and these comments exposed some misunderstandings (including some of my own misconceptions) about the REPLACE statement itself and what goes on behind the scenes in the storage engine.
The question I asked was this: if I execute the following statements in a client, what would you expect would happen behind the scenes in the storage engine?
CREATE TABLE t1 (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
, padding VARCHAR(200) NOT NULL
);
INSERT INTO t1 VALUES (1, "I love testing.");
INSERT INTO t1 VALUES (2, "I hate testing.");
REPLACE INTO t1 VALUE (2, "I[Read more...]
This article is out of date and the replication API has been updated. Please see the follow-up article for the most up to date information! OK, so here is the next installment in the Drizzle replication article series. Today, I'll be talking about the flow of the Command message object through the CommandReplicator and CommandApplier APIs. If you missed the first article about the structure of the Command message and Google Protobuffers, you may want to read that first. We'll only be talking in this article about what happens on one server. We will be
This article is out of date and the replication API has been updated. Please see the follow-up article for the most up to date information! I wanted to start writing about how Drizzle's new replication system works, how its internals are structured, how logs are formatted, what are its (current) limitations, what are planned features, how you can get involved in development, and a lot more. Before jumping in, you may want to read a quick overview about the concepts of Drizzle replication here.
Fortunately, some advice from my friend Edwin DeSouza got me back to reality: "Jay, do a series of[Read more...]
Over the past week, I've been refactoring the way that the Drizzle kernel communicates with plugin modules that wish to implement functionality related to replication. There are many, many potholes in the current way that row-based replication works in Drizzle, and my refactoring efforts were solely focused on three things:
Let me expand on these two goals, and why they are critical to the success of a[Read more...]
Last night and today, I ran a series of benchmarks against Drizzle. These benchmarks were designed to isolate the performance improvement or regression from one change: using Eric Day's new libdrizzle client library instead of the legacy libdrizzleclient library from MySQL. The results are in, and they are stunning.
Here is a graph showing the difference between Drizzle sysbench on a readonly workload with the only difference being sysbench using the libdrizzle driver versus using the libdrizzleclient (libmysql) driver for sysbench:[Read more...]
We're making steady progress in removing bottlenecks in the Drizzle code base. So far, a number of mutexes have been removed and we've begun to replace a number of contention points with atomic instructions which remove the need for a lock structure on platforms which support atomic fetch and store instructions.
I'm pretty positive about the direction we are going so far. We're seeing the right trends in our scaling graphs, with very little performance drop off in read-only workloads up to 4X the number of cores on the machine, and little performance drop off on the read-write workloads up to 2X the number of cores, as you can see from the graphs below.[Read more...]
Yesterday, Monty and I were fussing around with lcov and genhtml trying to generate code coverage analysis for Drizzle. After a few hours, I was finally able to get some good output, and I've published the results temporarily on my website.
We're currently at 70.4% code coverage which is less-than-ideal, but at least we now have a baseline from which to improve. We're all about making incremental improvements, and having statistics to tell us whether we're going in the right direction is important. This is a good first step.
OK, so for those readers not familiar with gcov or lcov, here is what these code coverage numbers actually mean... They represent the percentage of executable source lines which are executed during a run of Drizzle's[Read more...]
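As a rough illustration of the arithmetic behind a line-coverage figure, here is a small Python sketch. It is not how gcov or lcov are implemented. The function name and the line counts are hypothetical, chosen only so that the result matches the 70.4% quoted above.

```python
def line_coverage(executed_lines: int, executable_lines: int) -> float:
    """Percentage of executable source lines that ran at least once."""
    if executable_lines <= 0:
        raise ValueError("executable_lines must be positive")
    return round(100 * executed_lines / executable_lines, 1)


# Hypothetical counts: 704 of 1000 executable lines were hit by the test run.
print(line_coverage(704, 1000))
```

Lines that can never execute (comments, blank lines, declarations) are not part of the denominator, which is why the figure is a percentage of executable lines rather than of all lines in the source tree.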
Taking a little break from refactoring temporal data handling this evening, I decided to run some profiles against both Drizzle and MySQL 5.1.33. I profiled the two servers with callgrind (a valgrind tool/skin) while running the drizzleslap/mysqlslap test case. In both cases, I had to make a small change to the drizzled/tests/test-run.pl Perl script.
For the MySQL build, I used the BUILD/compile-amd64-debug-max build script. For Drizzle, I used my standard build process which builds Drizzle with maximum debugging symbols and hooks. It's worth noting that the debug and build processes for MySQL and Drizzle are very different, and the MySQL debug build contains hooks to the DBUG library, which you'll notice appear on the MySQL call graphs and consume a lot of the overall function calls. You won't see this in the Drizzle graphs because we do not use DBUG.[Read more...]
The frustration builds.
I have come to despise MySQL's sql_mode. It is a hack of the most gargantuan proportions.
Basically, the optimizer just ignores the sql_mode whenever it is convenient for it to do so. More importantly, the optimizer silently ignores bad datetime input in various places. The reason goes back to my statement above: sql_mode is a big ole' hack. Instead of the runtime executor in MySQL being fixed to use real ValueObject types that are immutable and know how to convert (and not convert) between each other, the runtime is a mess of checks for various runtime codes, warning modes, "count_cuted_field" crap and other miscellany that obfuscates the executor pipeline almost beyond recognition.
Slowly, I am attacking the mess, but the executor is so fragile that even tiny changes can wreak[Read more...]
Over the past few weeks, I have been happy working on Drizzle. Why have I been happy? Is it because of some new incredible code that will revolutionize the database industry? Nope. Is it because we've been able to remove all the issues that plague the server core? Nope. Is it because I see Drizzle quickly morphing into a modular, standard-conforming super-kernel? Nope.
So, why am I joyous?
To paraphrase the late Charlton Heston: "[Drizzle] is people!"
Recently, I've seen the fruit that transparent, open source development bears. This fruit takes the form of engaged, motivated, and humble individuals who wish to make their mark on a project.
Whether it's on IRC on #drizzle, the[Read more...]
As I mentioned in my previous article, I've been working on refactoring the temporal data handling in Drizzle. The major problem I've been dealing with is poorly documented or undocumented code. The lack of documentation has led me to rely on the MySQL Manual in some cases, then on additional research and, lastly, on my own intuition as to what was going on.
One of the earliest cases of me saying "WTF?" was when I was investigating how day numbers were calculated. Here is the original, unmodified code from MySQL (/libmysql/my_time.c:746-778). I've highlighted in blue the massive amount of comments explaining the inner workings of the function and what it is doing.
/*
  Calculate nr of day since year 0 in new date-system (from 1615)

  SYNOPSIS
    calc_daynr()
    year    Year (exact 4 digit year, no year conversions)
    month   Month
    day[Read more...]
So, you want to compile Workbench for Linux, on Fedora 9. You need to install the following packages:
autoconf automake libtool libzip-devel libxml2-devel libsigc++20-devel libglade2-devel gtkmm24-devel mesa-libGLU-devel mysql-libs mysql mysql-devel uuid-devel lua-devel glitz-devel glitz-glx-devel pixman-devel pcre-devel libgnome-devel gtk+-devel pango-devel cairo
I feel I’m being too liberal with dependencies, but I’m not about to strip it; I just want to get it working first :)
You need to have ctemplate and ctemplate-devel installed from updates-testing-newkey (relevant koji build log).
By default, configure.in in Workbench looks for “google-ctemplate”, as opposed to just “ctemplate” as[Read more...]
In this second part of my Launchpad guidebook series, I'll be covering the code management and repository features of Launchpad.net. If you missed the first part of my series, go check it out and get established on Launchpad.net. Then pop back to this article to dive into the magic of http://code.launchpad.net. In this article, we'll cover the following aspects of the code management pieces of Launchpad:
This article explains how to set up a properly functioning C/C++ development environment on Linux. The article is aimed at developers interested in contributing to the Drizzle server project, but the vast majority of the content applies equally well to developers wishing to contribute to the MySQL server or any other open source project written in C/C++.
IMPORTANT: This article doesn't get into any religious battles over IDEs or particular editors. IDEs and editors are what you use to edit code. What this article covers is the surrounding libraries, toolchain, and dependencies needed to get into the development or contribution process. That said, go Vim.[Read more...]
SAVEPOINTs [Read more...]
I wrote previously about looking for a more powerful search solution, and I mentioned that Xapian wasn’t quite so convenient for indexing my data. I then chose to experiment with sphinx a little more, and proceeded to create a number of search engines and index a number of data sources in order to decide which direction to go. Unfortunately, while sphinx was convenient and still provides an excellent backend for basic search indexes, I’m revisiting Xapian once again based on its more-than-anticipated flexibility. I was brief in my explanation of Xapian, however, and didn’t mention some of its more important and powerful aspects.
Xapian is primarily an API for search indexing/data retrieval. They do provide a handy utility called Omega (available here) for[Read more...]