Showing entries 1 to 8
Displaying posts with tag: clustered indexes (reset)
Slides of my talk on B+Tree Indexes and InnoDB

The slides of my talk on B+Tree Indexes and InnoDB are now available for download. This slide was presented during Percona Live London 2011. You can download the slides from here.
There are many other interesting and informative talks that were presented during Percona Live London 2011, and I think you should definitely check them out, if you haven't. They are available here.

The post Slides of my talk on B+Tree Indexes and InnoDB appeared first on ovais.tariq.

A few notes on InnoDB PRIMARY KEY

InnoDB uses an index-organized data storage technique, wherein the primary key acts as the clustered index and this clustered index holds the data. Its for this reason that understanding the basics of InnoDB primary key is very important, and hence the need for these notes.

Understanding InnoDB clustered indexes

Some people don't probably know, but there is a difference between how indexes work in MyISAM and how they work in InnoDB, particularly when talking from the point of view of performance enhancement. Now since, InnoDB is starting to be widely used, it is important we understand how indexing works in InnoDB. Hence, the reason for this post!

Why do I recommend switching over from MyISAM to Innodb!

Although MyISAM has been the default storage engine for MySQL but its soon going to change with the release of MySQL server 5.5. Not only that, more and more people are shifting over to the Innodb storage engine and the reasons for that is the tremendous benefits, not only in terms of performance, concurrency, ACID-transactions, foreign key constraints, but also because of the way it helps out the DBA with hot-backups support, automatic crash recovery and avoiding data inconsistencies which can prove to be a pain with MyISAM. In this article I try to hammer out the reasons why you should move on to using Innodb instead of MyISAM.

A review of Relational Database Design and the Optimizers by Lahdenmaki and Leach

Relational Database Index Design and the Optimizers

Relational Database Index Design and the Optimizers. By Tapio Lahdenmaki and Mike Leach, Wiley 2005. (Here’s a link to the publisher’s site).

I picked this book up on the advice of an Oracle expert, and after one of my colleagues had read it and mentioned it to me. The focus is on how to design indexes that will produce the best performance for various types of queries. It goes into quite a bit of detail on how databases execute specific types of queries, including sort-merge joins and multiple index access, and develops a generic cost model that can be used to produce a quick upper-bound estimate (QUBE) for the …

[Read more]
How often should you use OPTIMIZE TABLE?

Many times I’ve heard people advise on “best practices” for a MySQL database. This often includes routine maintenance, such as “you should run OPTIMIZE TABLE on all of your InnoDB tables once a week to defragment them for better performance.”

But this advice is unsubstantiated and could even be detrimental. Here are some of the obvious problems that I can think of:

  • The optimized table compacts the primary key (clustered index) to its default 15/16ths fill factor per page. But other indexes will be built in pseudo-random order and are likely to end up just as fragmented afterwards as before. Which indexes are more important for performance? Maybe the primary key is just a dummy value that’s not even used, and the secondary indexes are the ones that would benefit from compacting.
  • Suppose the primary key is the important one, and SELECT queries will perform more quickly if it’s defragmented. Why does it …
[Read more]
Extended covering indexes

As you can probably guess, I’m catching up on reading my blogs. I’ve just read with interest about TokuDB’s multiple clustering indexes. It’s kind of an obvious thought, once someone has pointed it out to you. I’ve only been around products that insist there can be Only One clustered index (and then there’s ScaleDB, who say “think differently already”).

Anyway, we already know that there are quite a few database products that use clustered indexes and to avoid update overhead, require every non-clustered index to store the clustered key as the “pointer” for row lookups. Thus there are “hidden columns” which are present at the leaf nodes, but not the non-leaf nodes, of secondary indexes. Why not take that idea and run with it a little? Here’s what I mean:

[Read more]
Clustered indexing and query performance

Last time I showed where partitioning could negatively impact performance, with one partitioned query being four times slower than a non-partitioned one when the data was partitioned by the same column as it was clustered by.  This time I’m going to show a way to get better performance by selecting a good clustered index. With the InnoDB, the create table primary key syntax encourages one to create the clustered index the same as the primary key. For transaction systems, in many cases, this makes sense.  But there are times, particularly for reporting systems, when this isn't advisable. 

To demonstrate this two similar tables will be created where the only difference is the indexing.  The below SQL shows an one of these tables, a 20 gig, 120 million rows tables, representing one year (about 10 million per month) of data.  This table is clustered by the primary key. 

create table SaleOrderCluster ( …

[Read more]
Showing entries 1 to 8