Sarah Novotny recently posted InnoDB Tablespace Fragmentation – Find it and Fix it, so I thought I would share how I determine MySQL fragmentation. This works for MyISAM and InnoDB tables, and will work whether or not innodb_file_per_table is in use. The basic idea is to compare the size of the file(s) on disk with […]
You have about 14 more hours to submit session proposals to the O’Reilly MySQL conference, which is soliciting sessions about all open-source databases: PostgreSQL, CouchDB, Riak, Hadoop, Firebird, you name it. The last-minute proposals are coming fast and furious, as usual, and we have a great selection to choose from, but we need more!
Related posts:
- Postgres folks, consider the 2011 MySQL conference
- Submit your proposals for MySQL conference 2010
- …
I’d meant to finish writing this way back in July… but I failed at that. Now is a good time to talk about Rookie-O as my again new colleague Andrew Hutchings (Buy his and Sergei’s book on MySQL 5.1 Plugin Development!) just went through the same thing (but in London instead of Hong Kong) given by the same trainer (Hi Eddie!).
Rackspace is the second employer I’ve had that has some kind of new hire training (the first being Sun). I am, of course, not quite counting Salmiakki as new-hire training for MySQL (although I probably should). To quote from the …
[Read more]Monty Program along with various guests and our friends at the recently founded SkySQL recently wrapped up a meeting in Istanbul Turkey. Sadly it rained most days but we had a good time and got a lot done.
Besides learning how to say “Şerefe” which is “Cheers” in Turkish, we discussed the future of MariaDB, our companies goals, the MariaDB knowledge base and many other topics. Since I am the web guy, I lead the presentation / talk on the KB and thought I would share some highpoints of it with you.
If you don’t know, the KB is located at http://kb.askmonty.org. The goal is to provide one location for MariaDB and MySQL documentation along with a place to ask questions (and have developers actually answer them). To start we have both information licensed under the GPL about …
[Read more]Drizzle still has a number of quirks inherited from the MySQL Storage Engine API (e.g. BLOBs, row buffer, CREATE SELECT and lack of DDL transaction boundaries, key tuple format). One of the things we fixed a long time ago was to have proper methods for StorageEngines to be called for: startTransaction, startStatement, endStatement, commit and rollback.
If you’ve had to implement a transactional storage engine in MySQL you will be well aware of the pattern of “in every …
[Read more]Since I announced SlackDB a few weeks ago, I’ve had a number of questions and interesting conversations in response. I thought I would summarize the initial feedback and answer some questions to help clarify things. One of the biggest questions was “Isn’t this what Drizzle is doing?”, and the answer is no. They are both being designed for “the cloud” and speak the MySQL protocol, but they provide very different guarantees around consistency and high-availability. The simple answer is that SlackDB will provide true multi-master configurations through a deterministic and idempotent replication model (conflicts will be resolved via timestamps), where Drizzle still maintains transactions and ACID properties, which imply single master. Drizzle could add support for clustered configurations and distributed transactions (like the NDB storage engine), but writes would still happen on the …
[Read more]
Single-threaded apply is one of the big downsides of MySQL's
built-in replication, as Baron Schwartz pointed out a couple of days
ago. While a master can process dozens of updates at once,
slaves must apply them one after the other on a single
thread. Add in disk I/O, and the result is very slow
performance indeed. The obvious answer is parallel
apply, namely writing multiple non-conflicting updates to the
slave at once.
I have spent the last few months implementing parallel apply for
Tungsten 2.0, which we are now testing at customer sites.
In this article I would like to describe how Tungsten's parallel
apply works as well as some of the lessons that have become
apparent through the implementation.
There are a couple of big challenges in parallel apply. …
In my previous post I blogged about upgrading to MariaDB 5.2 RC. This is the first blog post where I will look at a new feature in MariaDB 5.2: Virtual Columns. This feature was originally contributed to MySQL by Andrey Zhakov in 2008. MariaDB is now the first to include it and ship it in an upcoming stable release. (I believe it is also included in Drizzle, at least Monty Taylor was working with Andrey on it some time ago.)
The MariaDB knowledgebase explains it in detail, but the basic idea is easy to explain:
…
[Read more]
TIMESTAMP and DATETIME is how dates and times, or temporal, data
is represented ion MySQL. These types are combined with a bunch
of operators on these types, such as ADDATE and TIME_FORMAT and
such. So is this enough to effectively manage temporal data then?
Nope, in my mind, far from it.
To begin with, time isn't such a simple type of data as we might
look at it. Is a date just a count of the number of seconds since
00:00:00 on Jan 1 1970 or something like that? And we then
convert it to a calendar and all is fine? Is time data really
linear in that sense? No, it's not. Dec 24 2010 is Christmas Eve
and when you get your Christmas gifts here in Sweden. Is this
significant? Also, it's a Friday. That this data is a Friday can
be calculated from the date or the seconds-since-jan-1-1970
counter, but that is a different thing. Looking at that counter,
you cannot see that there is no way to look at that number and
figure out that this day is …
There are several counters for disk IO in SHOW STATUS output. I want to know what causes IO to be done by InnoDB. I know the IO rate for my server from iostat output. How do I map that back to features in InnoDB?
The SHOW STATUS counters for InnoDB include:
* Innodb_data_fsyncs - number of fsync calls for data and log files
* Innodb_log_writes - number of writes to the log file. These use buffered IO.
* Innodb_os_log_fsyncs - number of fsync calls for the log file
* Innodb_pages_read - number of reads done for data files. These include single-page reads done by per-connection threads, reads done by the main background thread during purge and insert buffer merges and reads done by the background IO threads.
* Innodb_data_reads - number of reads for data and log files. …
[Read more]