Showing entries 31 to 40 of 99
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: partitioning (reset)
Partitioning the WordPress Comments Table

WordPress sites can get big. Really big. When you’re looking at a site of Cheezburger, Engadget or Techcrunch proportions, you get hundreds of comments per post, on dozens of posts per day, which adds up to millions of comments per year.

In order to keep your site running in top condition, you don’t want to be running queries against tables with lots of rarely accessed rows, which is what happens with most comments – after the post drops off the front page, readership drops, so the comments are viewed much less frequently. So, what we want to do is remove these old comments from the primary comment table, but keep them handy, for when people read the archives.

Enter partitioning.

The idea of MySQL partitioning is that it splits tables up into multiple logical tablespaces, based on your criteria. Running a query on a single partition of a large table is much faster than running it across the entire table, even with …

[Read more]
Replication and sandbox talks on the road - San Francisco, Santa Clara, Orlando

In a few weeks, I will be on the road, for an intense set of presentations in the USA.

San Francisco, April 7thI will start the tour at the San Francisco MySQL User Group. On April 7, at 6pm I will talk about Advanced MySQL replication for the masses. This talk will explore topics such as bi-directional replication, multiple sources, parallel replication, seamless failover, with the help of Tungsten replicator.

Santa Clara, April 11, 12The MySQL conference starts on Monday, April 11th, with several three-hour tutorials.

[Read more]
MySQL Partitioning: A Flow Chart

In Part 1, and Part 2 of this series, I presented some thoughts on partitioning. I heard some great feedback on why people use partitioning. Here, I present a flow chart that summarizes what I’ve learned. In summary: with TokuDB in the picture there’s almost no reason to use partitioning. Or I should say, there are almost always better (higher performing, more robust, lower maintenance) alternatives to partitioning.

Here goes:

  1. Spindle contention? In other words, are you partitioning in order to spread your query work load across many disks? I’ve yet to see a compelling technical case that RAIDing your disks doesn’t do this as well, with much less setup and maintenance.
[Read more]
Partitioning, Free Lunches, & Indexing, Part 2


In part one, I presented a very brief and particular view of partitioning. I covered what partitioning is, with hardly a mention of why one would use partitioning. In this post, I’ll talk about a few use cases often cited as justification for using partitions.

Lots of disks → Lots of partitioning of tables

One use case for justifying partitions is that each partition can be placed on a separate disk to avoid spindle contention. I have to say that on this one, I agree with Kevin Burton, who makes the point that if you want to distribute I/O load across several disks, you can use a RAID configuration on the disks. In this case, he says that partitioning is not worth the trouble. [NB. He makes the point that this …

[Read more]
Data Warehousing Best Practices: Comparing Oracle to MySQL pt 2

At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.

See part 1 for the introduction and talking about power and hardware. This part will go over the 2nd “P”, partitioning. Learning about Oracle’s partitioning has gotten me more interested in how MySQL’s partitioning works, and I do hope that MySQL partitioning will develop to the level that Oracle partitioning does, because Oracle’s partitioning looks very nice (then again, that’s why it costs so much I guess).

Partition – …

[Read more]
Using MySQL Partitioning Instead of MERGE Tables

One common question I get is how to use partitioning instead of MERGE tables. The process I use involves using stored procedures to create and drop partitions. This article will go over the stored procedures I use; special thanks to Roland Bouman for taking a look and giving great feedback to optimize this process.

First, a simple table, not partitioned (yet):

use test;
  id int NOT NULL,
  creationDate datetime NOT NULL,
  PRIMARY KEY (id,creationDate)

In real, life there is more to the table than just id and creationDate. The most important part is that the partitioned field(s) need to be part of the primary key.

Now, add the partition definition. This can be done in the CREATE statement, but I have found that it is easier for me to think …

[Read more]
Welcome googleCL
I am writing this blog post with Vim, my favorite editor, instead of using the online editor offered by blogger. And I am uploading this post to my Blogger account using Google CL a tool that lets you use Google services from the command line.
I am a command line geek, and as soon as I saw the announcement, I installed it in my laptop. The mere fact that you are reading this blog post shows that it works.

GoogleCL is an apparently simple application. If you install it on Mac using macports you realize how many dependencies it has and how much complexity it gives under the …

[Read more]
I wrote a new tool that runs aggregation queries over MySQL sharded databases using Gearman.

I created a new tool this week:

As the name Shard-Query suggests, the goal of the tool is to run a query over multiple shards, and to return the combined results together as a unified query. It uses Gearman to ask each server for a set of rows and then runs the query over the combined set. This isn't a new idea, however, Shard-Query is different than other Gearman examples I've seen, because it supports aggregation.

It does this by doing some basic query rewriting based on the input query.

Take this query for example:

select c2, 
 from t1 as s0 
 join t1 using (c1,c2) 
 where c2 = 98818 
 group by c2;

The tool will split this up into two queries.

This first query will be sent to each shard. Notice that …

[Read more]
Two quick performance tips with MySQL 5.1 partitions
While I was researching for my partitions tutorial, I came across two hidden problems, which may happen often, but are somehow difficult to detect and even more difficult to fix, unless you know what's going on, and why. I presented both cases during my tutorial, but there were no pictures to convey the mechanics of the problem. Here is the full story.

TO_DAYS() prunes two partitions instead of one
If you are partitioning by date, chances are that you are using TO_DAYS(). And depending on how you have partitioned your table, your queries are as fast as you expect them to be. However, there are cases where your query takes twice as long as it should, and of course this will not make you happy.

For example, in a table partitioned by month, when …

[Read more]
MySQL: Partition-wise backups with mysqldump

To whom it may concern,

in response to a query from André Simões (also known as ITXpander), I slapped together a MySQL script that outputs mysqldump commands for backing up individual partitions of the tables in the current schema. The script is maintained as a snippet at MySQL Forge. How it worksThe script works by querying the information_schema.PARTITIONS system view to …

[Read more]
Showing entries 31 to 40 of 99
« 10 Newer Entries | 10 Older Entries »