Showing entries 31 to 40 of 47
« 10 Newer Entries | 7 Older Entries »
Displaying posts with tag: indexing
Query Planner Gotchas

Indexes can reduce the amount of data your query touches by orders of magnitude. This results in a proportional query speedup. So what happens when you define a nice set of indexes and you don’t get the performance pop you were expecting? Consider the following example:

mysql> show create table t;
| t     | CREATE TABLE `t` (
  `a` varchar(255) DEFAULT NULL,
  `b` bigint(20) NOT NULL DEFAULT '0',
  `c` bigint(20) NOT NULL DEFAULT '0',
  `d` bigint(20) DEFAULT NULL,
  `e` char(255) DEFAULT NULL,
  PRIMARY KEY (`b`,`c`),
  KEY `a` (`a`,`b`,`d`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Now we’d like to perform the following query:

select sql_no_cache count(d) from t where a = 'this is a test' and b between 8000000 and 8100000;

Great! We have index a, which covers this query. Using a should be really fast. You’d expect to use the index to jump to the beginning of the ‘this is a test’ values for …

[Read more]
EffectiveMySQL Meetup in NY

The first EffectiveMySQL meetup will be held in NY on Tuesday 22nd March 2011 by Ronald Bradford.

The title of the talk is “How better indexes save you money”. Saving money? Hey, sure thing :) I’m in, Ronald.

For those of you who do not know Ronald Bradford, he’s an Oracle ACE Director in the MySQL field, a longtime community contributor and a MySQL expert.

I hope to see you at 902 Broadway, New York, NY on Tuesday 22nd March at 6pm.

MySQL Query Optimization – Tip #1 – Avoid using a wildcard character at the start of a LIKE pattern

The more I read other people's SQL, the more I see developers making the same mistakes over and over again, so I decided to start a series of tips that help developers optimize their queries and avoid common pitfalls. This post is the first tip in that series: "Avoid using a wildcard character at the start of a LIKE pattern".
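
The reasoning behind this tip can be sketched with a toy model. The Python below is not MySQL's implementation; it assumes only that a B-tree index keeps its values in sorted order, and the names `index`, `prefix_search` and `suffix_search` are made up for illustration. A pattern with a fixed prefix maps to one contiguous range of the sorted values, while a pattern that starts with a wildcard gives the sort order nothing to work with.

```python
from bisect import bisect_left

# Toy stand-in for a B-tree index: the indexed values kept in sorted order.
# (Hypothetical data, chosen only for illustration.)
index = sorted(["adams", "baker", "smith", "smithers", "smythe", "zimmer"])


def prefix_search(prefix):
    """Model of LIKE 'prefix%': binary-search to the first candidate, then read forward."""
    start = bisect_left(index, prefix)
    matches, examined = [], 0
    for value in index[start:]:
        examined += 1
        if not value.startswith(prefix):
            break  # sorted order guarantees that no later value can match
        matches.append(value)
    return matches, examined


def suffix_search(suffix):
    """Model of LIKE '%suffix': the sort order is no help, so every entry is examined."""
    matches = [value for value in index if value.endswith(suffix)]
    return matches, len(index)


print(prefix_search("smi"))  # (['smith', 'smithers'], 3)
print(suffix_search("ith"))  # (['smith'], 6)
```

In this model the prefix search examines three entries no matter how large the index grows around them, whereas the suffix search always examines every entry.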

A review of Relational Database Index Design and the Optimizers by Lahdenmaki and Leach

Relational Database Index Design and the Optimizers. By Tapio Lahdenmaki and Mike Leach, Wiley 2005.

I picked this book up on the advice of an Oracle expert, and after one of my colleagues had read it and mentioned it to me. The focus is on how to design indexes that will produce the best performance for various types of queries. It goes into quite a bit of detail on how databases execute specific types of queries, including sort-merge joins and multiple index access, and develops a generic cost model that can be used to produce a quick upper-bound estimate (QUBE) for the …

[Read more]
Databases: Normalization or Denormalization. Which is the better technique?

There has long been a debate over which approach performs better: normalized databases or denormalized databases. This article is my attempt to figure out the right strategy, because neither approach can be rejected outright. I will start off by discussing the pros and cons of both approaches. Pros and Cons of a Normalized database design. Normalized databases fare very well when the application is write-intensive and the write load exceeds the read load. This is because of the following reasons: Normalized tables are usually smaller and...

How (not) to find unused indexes

I've seen a few people link to an INFORMATION_SCHEMA query that finds indexes with low cardinality, in an effort to work out which indexes should be removed. This method is flawed. Here's the first reason why:

CREATE TABLE `sales` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `customer_id` int(11) DEFAULT NULL,
  `status` enum('archived','active') DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `status` (`status`)
) ENGINE=MyISAM AUTO_INCREMENT=65691 DEFAULT CHARSET=latin1;

mysql> SELECT count(*), status FROM sales GROUP BY status;
+----------+----------+
| count(*) | status   |
+----------+----------+
|    65536 | archived |
[Read more]
Comparison Between Solr And Sphinx Search Servers (Solr Vs Sphinx – Fight!)

In the past few weeks I've been implementing advanced search at Plaxo, working quite closely with the Solr enterprise search server. Today, I saw this relatively detailed comparison between Solr and its main competitor Sphinx (full credit goes to StackOverflow user mausch, who had been using Solr for the past 2 years). For those still confused, Solr and Sphinx are similar to MySQL FULLTEXT search, or for those even more confused, think Google (yeah, this is a bit of a stretch, I know).

Similarities

  • Both Solr and Sphinx satisfy all of your requirements. They're fast and designed to index and search large bodies of data efficiently.
  • Both have a long list of high-traffic sites …
[Read more]
Better Primary Keys, a Benefit to TokuDB’s Auto Increment Semantics

In our last post, Bradley described how auto increment works in TokuDB. In this post, I explain one of our implementation’s big benefits, the ability to combine better primary keys with clustered primary keys.

In working with customers, the following scenario has come up frequently. The user has data that is streamed into the table in time order. The table has a primary key that is an auto increment field, 'id', and an index on the field 'time'. The user's queries all cover some range of time (e.g. select sum(clicks) from foo where time > date '2008-12-19' and time < date '2008-12-20';).

For storage engines with clustered primary keys (such as TokuDB and InnoDB), having such a schema hurts query performance. Queries do a range query on a secondary index (time), and then perform point …
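
The access pattern described above can be sketched with a toy model in Python. This is not how TokuDB or InnoDB is implemented: the rows, the helper names, and in particular the alternative of clustering on (time, id) are assumptions made for illustration, since the excerpt does not show the schema the post goes on to recommend. The sketch only counts how many point lookups into the primary key a time-range query needs under each layout.

```python
from bisect import bisect_left, bisect_right

# Hypothetical rows (id, time, clicks), arriving in time order as in the scenario.
rows = [(i, 100 + i, i % 3) for i in range(1, 11)]

# Schema from the scenario: rows clustered by id, plus a secondary index on time.
by_id = {row_id: (t, clicks) for row_id, t, clicks in rows}
time_index = sorted((t, row_id) for row_id, t, _ in rows)


def sum_clicks_secondary(lo, hi):
    """Range scan on the time index, then one point lookup per matching row."""
    times = [t for t, _ in time_index]
    start, end = bisect_right(times, lo), bisect_left(times, hi)  # lo < time < hi
    total, point_lookups = 0, 0
    for _, row_id in time_index[start:end]:
        total += by_id[row_id][1]  # point query into the clustered primary key
        point_lookups += 1
    return total, point_lookups


# Assumed alternative: cluster the rows on (time, id), so the range scan
# reads the clicks values directly and needs no point lookups.
clustered = sorted((t, row_id, clicks) for row_id, t, clicks in rows)


def sum_clicks_clustered(lo, hi):
    """Single range scan over rows stored in (time, id) order."""
    times = [t for t, _, _ in clustered]
    start, end = bisect_right(times, lo), bisect_left(times, hi)
    return sum(clicks for _, _, clicks in clustered[start:end]), 0


print(sum_clicks_secondary(103, 108))  # (4, 4): four matching rows, four point lookups
print(sum_clicks_clustered(103, 108))  # (4, 0): same sum, no point lookups
```

Both functions return the same sum; the difference is the number of extra lookups, which in a real engine can each cost a random read.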

[Read more]
MySQL 5.1 Grammar Changes to Support Clustering Indexes

This post is for storage engine developers who may be interested in implementing multiple clustering keys.

After we blogged about TokuDB’s multiple clustering indexes feature, Baron Schwartz suggested we contribute the patch to allow other storage engines to implement the feature. We filed a feature request to MySQL to support this, along with a proposed patch. The patch, along with known issues, can be found here.

What the patch contains:

This patch has the changes necessary to introduce the new grammar for clustering indexes, and to tell the storage engine what indexes are defined as clustering. With this patch, all indexes that are defined as clustering …

[Read more]
How clustering indexes sometimes help UPDATE and DELETE performance

I recently posted a blog entry on clustering indexes, which are good for speeding up queries.  Eric Day brought up the concern that clustering indexes might degrade update performance. This is often true, since any update will require updating the clustering index as well.

However, there are some cases in TokuDB for MySQL where the opposite is true: clustering indexes can drastically improve the performance of updates and deletions. Consider the following analysis and example.

Updates and deletions generally have two steps. First, a query finds the necessary rows (the WHERE clause in the statement), and then the rows are modified: they are deleted or updated, as the case may be. So the total time to do a deletion or an update is

T_total = T_query + T_change

Eric noted that …

[Read more]