The purpose of this post is to describe covering indexes and how they can be used to improve query performance. People mostly use indexes to filter or sort results, but little thought is given to reducing disk reads through well-chosen indexes. So I will show you how to reduce disk reads, and hence improve query performance, by utilizing indexes properly.
Indexes can reduce the amount of data your query touches by orders of magnitude. This results in a proportional query speedup. So what happens when you define a nice set of indexes and you don’t get the performance pop you were expecting? Consider the following example:
mysql> show create table t;
CREATE TABLE `t` (
  `a` varchar(255) DEFAULT NULL,
  `b` bigint(20) NOT NULL DEFAULT '0',
  `c` bigint(20) NOT NULL DEFAULT '0',
  `d` bigint(20) DEFAULT NULL,
  `e` char(255) DEFAULT NULL,
  PRIMARY KEY (`b`,`c`),
  KEY `a` (`a`,`b`,`d`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Now we’d like to perform the following query:
select sql_no_cache count(d) from t
where a = 'this is a test' and b between 8000000 and 8100000;
Great! We have index a, which covers this query. Using a should be really fast. You’d expect the index to jump to the beginning of the ‘this is a test’ values for …[Read more]
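A quick way to check whether a plan is actually covering is EXPLAIN: when MySQL can answer the query from the index alone, the Extra column shows “Using index”. A minimal sketch, using the table and index from the example above:

```sql
-- If the plan is covering, EXPLAIN reports "Using index" under Extra,
-- meaning no table rows are read at all.
EXPLAIN
SELECT COUNT(d) FROM t
WHERE a = 'this is a test' AND b BETWEEN 8000000 AND 8100000;

-- FORCE INDEX lets you compare plans when the optimizer prefers
-- a different access path over the covering index `a`:
EXPLAIN
SELECT COUNT(d) FROM t FORCE INDEX (a)
WHERE a = 'this is a test' AND b BETWEEN 8000000 AND 8100000;
```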
An InnoDB table must have a primary key (one is created if you don’t do it yourself). You may have a natural key at hand. Stop! Allow me to suggest an AUTO_INCREMENT may be better.
Why should one add an AUTO_INCREMENT PRIMARY KEY on a table on which there’s a natural key? Isn’t an AUTO_INCREMENT a pseudo key, meaning, it doesn’t have any explicit relation to the row data, other than it is a number and unique?
Yes, indeed so. Nevertheless, consider:
- Natural keys are often multi-column.
- Multi-column PRIMARY KEYs make for larger keys, and make for bloated secondary keys as well, since InnoDB stores a copy of the PRIMARY KEY columns in every secondary index entry. You may be wasting space storing the additional AUTO_INCREMENT column, but you may gain space back on secondary keys.
- Multi column PRIMARY KEYs make for more locks. See also …
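As an illustration of the space argument, compare a natural composite key with a surrogate AUTO_INCREMENT key. The `person_natural` and `person_surrogate` tables below are hypothetical; the point is that each secondary index entry carries the PRIMARY KEY columns as its row pointer:

```sql
-- Natural key: every entry in KEY (email) also carries
-- (first_name, last_name, birth_date).
CREATE TABLE person_natural (
  first_name VARCHAR(64) NOT NULL,
  last_name  VARCHAR(64) NOT NULL,
  birth_date DATE NOT NULL,
  email      VARCHAR(128),
  PRIMARY KEY (first_name, last_name, birth_date),
  KEY (email)
) ENGINE=InnoDB;

-- Surrogate key: every entry in KEY (email) carries only a 4-byte id;
-- the natural key is retained as a UNIQUE constraint.
CREATE TABLE person_surrogate (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(64) NOT NULL,
  last_name  VARCHAR(64) NOT NULL,
  birth_date DATE NOT NULL,
  email      VARCHAR(128),
  PRIMARY KEY (id),
  UNIQUE KEY (first_name, last_name, birth_date),
  KEY (email)
) ENGINE=InnoDB;
```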
The title of the talk is “How better indexes save you money”. Saving money? Hey, sure thing :) I’m in, Ronald.
For those of you who do not know Ronald Bradford, he’s an Oracle Ace Director in the MySQL field, a long time community contributor and a MySQL expert.
I hope to see you at 902 Broadway New York, NY on Tuesday 22nd March 6pm.
A simple question I’ve been asked:
Is it possible to merge two UPDATE queries, each on different WHERE conditions, into a single query?
For example, is it possible to merge the following two UPDATE statements into one?
mysql> UPDATE film SET rental_duration=rental_duration+1 WHERE rating = 'G';
Query OK, 178 rows affected (0.01 sec)

mysql> UPDATE film SET rental_rate=rental_rate-0.5 WHERE length < 90;
Query OK, 320 rows affected (0.01 sec)
To verify our tests, we take a checksum:
mysql> pager md5sum
PAGER set to 'md5sum'
mysql> SELECT film_id, title, rental_duration, rental_rate FROM film ORDER BY film_id;
c2d253c3919efaa6d11487b1fd5061f3 -
Obviously, the following query is …[Read more]
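One way the two statements could be merged, sketched here as an assumption about where the excerpt is heading: a single UPDATE whose SET clauses apply each change conditionally via IF(), with a WHERE matching the union of the two conditions (columns are from the Sakila `film` table used above):

```sql
-- Each column changes only for rows matching its own condition;
-- rows matching neither condition are excluded by the WHERE clause.
UPDATE film
SET
  rental_duration = rental_duration + IF(rating = 'G', 1, 0),
  rental_rate     = rental_rate     - IF(length < 90, 0.5, 0)
WHERE rating = 'G' OR length < 90;
```

The md5sum checksum taken beforehand makes it possible to verify that the merged statement leaves the table in the same state as the two separate ones.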
The more I go through other people’s SQL, the more I see developers making the same mistakes over and over again, so I thought: why not start a series of tips to help developers optimize their queries and avoid common pitfalls? This post is the first tip in that series: “Avoid using a wildcard character at the start of a LIKE pattern”.
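To illustrate the tip with a hypothetical `users` table indexed on `name`: a pattern anchored at the start can use the index as a range scan, while a leading wildcard cannot, because a B-tree index is ordered by the leftmost characters of the value:

```sql
-- Can use an index on `name`: the fixed prefix 'smi' bounds a range scan.
SELECT * FROM users WHERE name LIKE 'smi%';

-- Cannot use the index for lookup: a leading '%' means any prefix may
-- match, so every row (or every index entry) must be examined.
SELECT * FROM users WHERE name LIKE '%smi';
```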
Risking some flames, I’d like to suggest only two options for choosing PRIMARY KEYs for InnoDB tables. I suggest they should cover 99% (throwing numbers around) of cases.
PRIMARY KEY cases
- An integer (SMALLINT / INT / BIGINT), possibly AUTO_INCREMENT column.
- The combination of two columns on a many-to-many connecting table (e.g. film_actor, which connects films to actors), the two columns being the PRIMARY KEYs of respective data tables. This rule may be extended to 3-way relation tables.
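Case 2 above can be sketched for Sakila’s film_actor table (column types and foreign keys assumed for illustration):

```sql
-- Many-to-many connecting table: the PRIMARY KEY is the pair of
-- columns referencing the two data tables' own PRIMARY KEYs.
CREATE TABLE film_actor (
  actor_id SMALLINT UNSIGNED NOT NULL,
  film_id  SMALLINT UNSIGNED NOT NULL,
  PRIMARY KEY (actor_id, film_id),
  KEY (film_id),  -- supports lookups from the film side
  FOREIGN KEY (actor_id) REFERENCES actor (actor_id),
  FOREIGN KEY (film_id)  REFERENCES film (film_id)
) ENGINE=InnoDB;
```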
A short recap: an InnoDB table must have a PRIMARY KEY. InnoDB will pick one if you don’t offer it: it can pick a really bad UNIQUE KEY (e.g. website_url(255)) or make one up using InnoDB’s internal row IDs. If you don’t have a good candidate, an …[Read more]
Here are a few thoughts on the current status and further possibilities for Facebook’s Online Schema Change (OSC) tool. I’ve had these thoughts for months now, pondering improvements to oak-online-alter-table, but haven’t got around to implementing them, nor even writing them down. Better late than never.
The tool has some limitations. Some cannot be lifted, some could. Quoting from the announcement and looking at the code, I add a few comments. I conclude with a general opinion on the tool’s abilities.
“The original table must have PK. Otherwise an error is returned.”
This restriction could be lifted: it’s enough that …[Read more]
This post follows up on Baron’s How often should you use OPTIMIZE TABLE?. I had the opportunity of doing some massive purging of data from large tables, and was interested to see the impact of the OPTIMIZE operation on the table’s indexes. I worked on some production data I was authorized to provide as an example.
The use case
I’ll present a single use case here. The table at hand is a compressed InnoDB table used for logs. I’ve rewritten some column names for privacy:
mysql> show create table logs \G
Create Table: CREATE TABLE `logs` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `origin` varchar(64) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
  `message` text …[Read more]
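One way to measure the impact of the operation, sketched here as an assumption about the methodology (the information_schema statistics are approximate for InnoDB):

```sql
-- Data and index sizes before the rebuild (approximate statistics).
SELECT data_length, index_length, data_free
FROM information_schema.TABLES
WHERE table_schema = DATABASE() AND table_name = 'logs';

-- Rebuild the table and its indexes; for InnoDB, OPTIMIZE TABLE is
-- mapped to a recreate-and-analyze operation.
OPTIMIZE TABLE logs;

-- Compare sizes after the rebuild.
SELECT data_length, index_length, data_free
FROM information_schema.TABLES
WHERE table_schema = DATABASE() AND table_name = 'logs';
```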
Relational Database Index Design and the Optimizers
Relational Database Index Design and the Optimizers. By Tapio Lahdenmaki and Mike Leach, Wiley 2005. (Here’s a link to the publisher’s site).
I picked this book up on the advice of an Oracle expert, and after one of my colleagues had read it and mentioned it to me. The focus is on how to design indexes that will produce the best performance for various types of queries. It goes into quite a bit of detail on how databases execute specific types of queries, including sort-merge joins and multiple index access, and develops a generic cost model that can be used to produce a quick upper-bound estimate (QUBE) for the …[Read more]