Showing entries 31 to 40 of 105
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: partitioning (reset)
The Query Cache and Partitions

Like others we were not satisfied with the fix for a bug in MySQL which caused the query cache and partitioning to not work reliably together. The bug, in simple terms, was that if the query cache was enabled and you used partitioned tables and if a partitioned table was using a transactional engine like InnoDB or XtraDB, the query cache could, under certain circumstances, return incorrect results.

Returning incorrect results is a definite, high-priority bug. However, the upstream fix was to disable all caching of queries from …

[Read more]
Heads up! No more query cache for partitioned tables as of MySQL 5.5.23.

A customer opened an issue recently to ask why the query cache wasn't working after he upgraded to MySQL 5.5.25. The reason really ended up surprising me.

As of MySQL 5.5.23, the Query Cache is disabled for partitioned tables!

This is a "fix" for bug #53775.

At first I thought perhaps the fix for the bug had resulted in the query cache being inadvertently disabled for partitioned tables, but the comments that go along with the commit make it pretty clear that disabling the query cache was the intended "fix". You can review the commit message and the code changed at revision 2661.803.1 in the MySQL Server 5.5 repository.

[Read more]
Overlooked MySQL 5.6 New Features – WL#5217

There’s a lot of great new features in MySQL 5.6 DMRs – almost too many to keep track of.  And while a lot of (justified) attention is given to the headline-grabbing features of 5.6 (memcached APIs! global transaction ids! improved PERFORMANCE_SCHEMA!), I’m often curious about the new features that don’t make as big a splash.  I thought I would look at one such new feature – WorkLog #5217.  I’m not telling you what this WorkLog is yet; test your knowledge of 5.6 features by seeing if you can figure it out from the following scenario.  Imagine the following table data:

mysql> SELECT * FROM p;
| a    | b    |
|    1 |    1 |
|   11 |    1 |
|   21 |    1 |
3 rows in set (0.00 sec)

OK, try to write an UPDATE statement that increments b for the row where a = 11.  Easy, right?

UPDATE p SET b = b+1 WHERE a = 11;

OK, now do it …

[Read more]
Benchmarking single-row insert performance on Amazon EC2

I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used a nice and effective tool iiBench which has been developed by Tokutek. Though the “1 billion row insert challenge” for which this tool was originally built is long over, but still the tool serves well for benchmark purposes.

OK, let’s start off with the configuration details.


First of all let me describe the EC2 instance type that I used.

EC2 Configuration

I chose m2.4xlarge instance as that’s the instance type with highest memory available, and memory is what really really matters.

High-Memory Quadruple Extra Large Instance
68.4 GB of memory
26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute …
[Read more]
MySQL Partitioning and its Confusing Syntax

While looking at partitioning I recently made a mistake which I guess can happen to others. Often this is due to not fully reading the documentation or scanning it too quickly and misunderstanding what’s being said.

So this post is to complain about the MySQL partitioning syntax and to warn others who may easily make the same mistake without realising.

First we probably need to ask why we are partitioning a table in the first place. The main reasons for this are I think:

  • to improve query performance
  • to reduce individual .ibd file sizes for large tables (if using innodb_file_per_table)

In my case I wanted to do both. I had a several tables which store a large number of rows (batches of data) based on an incremental batch number. One of these tables was around 40 GB and had about 500,000,000 rows in it.  When processing data in this table often all the data from a particular batch …

[Read more]
Proper handling of insert-mostly, select-recently datasets

Some kinds of large tables such as chat messages, blog entries, etc have the following characteristics.

* huge number of records, huge data and index size
* insert and select mostly
* select from only recent data
* select by secondary index (i.e. user_id)
* secondary index entries are inserted in random order

What are optimal ways to handle these tables? The below single large table does not perform well.

CREATE TABLE message (
body VARCHAR(255),
created DATETIME,

The cause of poor performance is a secondary index on user_id. user_id is inserted in random order. Index size grows, and sooner or later it will exceed RAM size. Once index size on user_id exceeds RAM size, inserting into message table causes massive random read disk i/o, …

[Read more]
Partitioning the WordPress Comments Table

WordPress sites can get big. Really big. When you’re looking at a site of Cheezburger, Engadget or Techcrunch proportions, you get hundreds of comments per post, on dozens of posts per day, which adds up to millions of comments per year.

In order to keep your site running in top condition, you don’t want to be running queries against tables with lots of rarely accessed rows, which is what happens with most comments – after the post drops off the front page, readership drops, so the comments are viewed much less frequently. So, what we want to do is remove these old comments from the primary comment table, but keep them handy, for when people read the archives.

Enter partitioning.

The idea of MySQL partitioning is that it splits tables up into multiple logical tablespaces, based on your criteria. Running a query on a single partition of a large table is much faster than running it across the entire table, even with …

[Read more]
Replication and sandbox talks on the road - San Francisco, Santa Clara, Orlando

In a few weeks, I will be on the road, for an intense set of presentations in the USA.

San Francisco, April 7thI will start the tour at the San Francisco MySQL User Group. On April 7, at 6pm I will talk about Advanced MySQL replication for the masses. This talk will explore topics such as bi-directional replication, multiple sources, parallel replication, seamless failover, with the help of Tungsten replicator.

Santa Clara, April 11, 12The MySQL conference starts on Monday, April 11th, with several three-hour tutorials.

[Read more]
MySQL Partitioning: A Flow Chart

In Part 1, and Part 2 of this series, I presented some thoughts on partitioning. I heard some great feedback on why people use partitioning. Here, I present a flow chart that summarizes what I’ve learned. In summary: with TokuDB in the picture there’s almost no reason to use partitioning. Or I should say, there are almost always better (higher performing, more robust, lower maintenance) alternatives to partitioning.

Here goes:

  1. Spindle contention? In other words, are you partitioning in order to spread your query work load across many disks? I’ve yet to see a compelling technical case that RAIDing your disks doesn’t do this as well, with much less setup and maintenance.
[Read more]
Partitioning, Free Lunches, & Indexing, Part 2


In part one, I presented a very brief and particular view of partitioning. I covered what partitioning is, with hardly a mention of why one would use partitioning. In this post, I’ll talk about a few use cases often cited as justification for using partitions.

Lots of disks → Lots of partitioning of tables

One use case for justifying partitions is that each partition can be placed on a separate disk to avoid spindle contention. I have to say that on this one, I agree with Kevin Burton, who makes the point that if you want to distribute I/O load across several disks, you can use a RAID configuration on the disks. In this case, he says that partitioning is not worth the trouble. [NB. He makes the point that this …

[Read more]
Showing entries 31 to 40 of 105
« 10 Newer Entries | 10 Older Entries »