I was playing around with an idea recently...
I wanted to find out, what is the lowest number of characters needed to satisfy 95% of the values in a column? 95% is to rule out outliers.
I plan on using this when I want to get a bit agressive with the indexes on a table that gets inserted to very often. But until now, I haven't had a good query to find it quickly.
So, I thought a bit and came up with the following query:
mysql> show create table filenames\G
*************************** 1. row ***************************
Create Table: CREATE TABLE `filenames` (
`id` int(11) NOT NULL,
I was playing around with an idea recently...
This is the title of my first blog post on MySQL Performance Blog. It deals with a customer case where the customer was facing a peculiar problem where the rows column in the EXPLAIN output of the query was totally off. The actual number of rows was 18 times more than the number of rows reported by MySQL in the output of EXPLAIN. Now this can be a real pain as MySQL uses “the number of rows” estimation to pick and choose indexes and it could really be picking up a wrong index simply because of the wrong estimate. You...
At next week’s NY Effective MySQL Meetup, I will give a talk: “Understanding Indexing: Three rules on making indexes around queries to provide good performance.” The meetup is 7 pm Tuesday, October 11th, and will be held at Hive at 55 (55 Broad Street, New York, NY). Thanks to host Ronald Bradford for the invitation.
Application performance often depends on how fast a query can respond and query performance almost always depends on good indexing. So one of the quickest and least expensive ways to increase application performance is to optimize the indexes. This talk presents three simple and effective rules on how to construct indexes around queries that result in …[Read more]
In my last post, we talked about the read/write tradeoff of indexing data structures, and some ways that people augment B-trees in order to get better write performance. We also talked about the significant drawbacks of each method, and I promised to show some more fundamental approaches.
We had two “workload-based” techniques: inserting in sequential order, and using fewer indexes, and two “data structure-based” techniques: a write buffer, and OLAP. Remember, the most common thing people do when faced with an insertion bottleneck is to use fewer indexes, and this kills query performance. So keep in mind that all our work on write-optimization is really work for read-optimization, in that write-optimized indexes are cheap enough that you can keep all the ones you need to get good read performance.
This guest post is from our friends at Percona. They’re
hosting Percona Live London from October 24-25,
2011. Percona Live is a two day summit with 100% technical
sessions led by some of the most established speakers in the
In the London area and interested in attending? We are giving away two free passes in the next few days. Watch our @tokutek twitter feed for a chance to win.
Did you know that the following query actually performs a JOIN? You can’t see it, but it’s there:
SELECT the_day, COUNT(*), SUM(clicks), SUM(cost) FROM ad_clicks_by_day WHERE the_day >= …[Read more]
Some indexing structures are write optimized in that they are better than B-trees at ingesting data. Other indexing structures are read optimized in that they are better than B-trees at query time. Even within B-trees, there is a tradeoff between write performance and read performance. For example, non-clustering B-trees (such as MyISAM) are typically faster at indexing than clustering B-trees (such as InnoDB), but are then slower at queries.
This post is the first of two about how to understand write optimization, what it means for overall performance, and what the difference is between different write-optimized indexing schemes. We’ll be talking about how to deal with workloads that don’t fit in memory—in particular, if we had our data in B-trees, only the internal nodes (perhaps not even all of them) would fit in memory.
As I’ve already said, there is a tradeoff between write and read …[Read more]
Archimedes, the first DBA
According to a recent MIT Sloan Management Review study, top performing organizations use analytics 5 times more than lower performers. That’s pretty astounding. And while we all know about the ocean/lake/waves/(your favorite water analogy) of Big Data we struggle with everyday, information is not knowledge. So how can we get insight from data? Recent articles from O’Reilly and HBR offered some …[Read more]
The summer’s end is rapidly approaching — in the next two weeks or so, most people will be settling back into work. Time to change your mindset, re-evaluate your skills and see if you are ready to go back from the picnic table to the database table.
With this in mind, let’s see how much folks can remember from the recent indexing talks my colleague Zardosht Kasheff gave (O’Reilly Conference, Boston, and SF MySQL Meetups). Markus Winand’s site “Use the Index, Luke!” (not to be confused with …[Read more]
Indexes are a very important part of databases and are used frequently to speed up access to particular data item or items. So before working with indexes, it is important to understand how indexes work behind the scene and what is the data structure that is used to store these indexes, because unless you understand the inner working of an index, you will never be able to fully harness its power.