Displaying posts with tag: Query Optimization
Picking Up Where You Left Off??

I started this as a response to Keith Murphy’s post at http://www.paragon-cs.com/wordpress/?p=54, but it got long, so it deserves its own post. The basic context is figuring out how to avoid duplicate data if a large INSERT statement fails before finishing.
Firstly, the surefire way to make sure there are no duplicates, if you have a unique (or primary) key, is to use INSERT IGNORE INTO.
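
As a quick sketch (the table and column names here are hypothetical, not from the post), re-running a failed load with INSERT IGNORE simply skips the rows that already made it in:

INSERT IGNORE INTO big_table (id, payload)
  SELECT id, payload FROM staging_table;
-- Rows whose `id` (the PRIMARY KEY) already exists are skipped
-- instead of aborting the whole statement with a duplicate-key error.
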
Secondly, I just experimented with adding an index to an InnoDB table that had 1 million rows, and here’s what I got (please note, this is one experience only, the plural of “anecdote” is *not* “data”; also I did this in this particular order, so there may have been caching taking place):  More »

Top 10 MySQL Best Practices

So, O’Reilly’s ONLamp.com has published the “Top 10 MySQL Best Practices” at http://www.onlamp.com/pub/a/onlamp/2002/07/11/MySQLtips.html. Sadly, I find most “best practice” lists do not explain the “why” thoroughly enough for people to make their own decisions.
For instance, #3 is “Protect the MySQL installation directory from access by other users.” I was intrigued by what they would consider the “installation” directory. Reading the tip, they actually mean the data directory. They say nothing of the log directory, nor that InnoDB data files may live in different places than the standard MyISAM data directories.  More »

Progress report on High Performance MySQL, Second Edition

It's been a while since I've written about progress on the book. I actually stopped working on it as much at the beginning of the month, because on October 31st I managed to finish a first draft of the last big chapter! Now I'm back to full-time work at my employer, and I'm working on the book in the evenings and weekends only. Read on for details of what I've been working on and what's next in the pipeline.

Making Queries 45-90 Times Faster!!

aka…..”when good queries go bad!”
So, today the developers were debugging why a script was running much longer than expected. They were inserting text into the database, and got to the point where they realized that doubling the amount of text doubled the time the queries took.
You see, they were doing similar text inserts over and over, instead of using connection pooling and/or batching them. Apparently the other DBA explained that it was a limitation of MySQL, but either the developers didn’t convey clearly what they were doing, or the DBA didn’t think to mention batching.  More »
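
For illustration only (the table here is hypothetical), batching means collapsing many single-row statements into one multi-row INSERT:

INSERT INTO messages (user_id, body)
VALUES (1, 'first comment'),
       (2, 'second comment'),
       (3, 'third comment');
-- One round trip and one statement to parse, instead of three.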

Selectivity and Index Performance

Let’s look into selectivity, as this is an important topic when looking at index performance. (Oooh, I said “performance”, watch everyone’s ears perk up!).

This will probably answer the questions “Why isn’t MySQL using my index?” or “Why is my query so slow when I have an index on it?”

Selectivity describes how distinct the values in a field are. It is a number from 0 to 1, although you can also think of it as a percentage. A value of 1, or 100%, means that every value in the field is unique. This is always the case for UNIQUE and PRIMARY keys, although non-unique fields may also have a selectivity of 1, for example a timestamp in a rarely used table.

To calculate selectivity, take the number of DISTINCT values in the field and divide by the total number of records.
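
In SQL that might look like the following sketch (the column name is hypothetical):

SELECT COUNT(DISTINCT last_name) / COUNT(*) AS selectivity
  FROM Users;
-- 1.0 means every value is unique; a value near 0 means the column
-- has very few distinct values.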

My company has a large Users table, so I grabbed some statistics off of that:


+----------+
| …

[Read more]
The Care and Feeding of MySQL Tables

Our site went from weekly crashes during our two busiest nights to not even peeping this week (during the two busiest nights), and the only thing we changed was that we did some table maintenance. We hadn’t done table maintenance in at least as long as I’ve been around, which is six months. We are a site with high volumes of both reads and writes. This article will talk about the care and feeding of tables; feel free to use it as justification for a maintenance window, or even as permission to run table maintenance statements.
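
The excerpt doesn’t list the exact statements that were run, but the usual MySQL table-maintenance statements look like this sketch (the table name is hypothetical):

CHECK TABLE Users;     -- look for corruption
ANALYZE TABLE Users;   -- refresh the index statistics the optimizer relies on
OPTIMIZE TABLE Users;  -- defragment the table and rebuild its indexes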

MySQL uses a cost-based optimizer to translate the query as you wrote it into what actually happens. This means that when you write:

SELECT foo FROM t1 INNER JOIN t2 USING (commonField);

The optimizer looks at the statistics for tables t1 and t2 and decides which is better:
1) To go through each item in t1, looking for a matching “commonField” in t2
or
2) To go …
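
One way to see which plan the optimizer actually chose is EXPLAIN; a quick sketch against the query above:

EXPLAIN SELECT foo FROM t1 INNER JOIN t2 USING (commonField);
-- The row order shows which table is read first; the key and rows
-- columns show which index is used and roughly how many rows are examined.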

[Read more]
One thing MERGE Tables Are Good For

Many people have some kind of reporting or auditing on their database. The problem is that the data grows very large, and lots of times there is data that can be purged. Sure, theoretically one never needs to purge data, but sometimes a “delete” flag just won’t work — when you search on the delete flag, a full table scan may be the most efficient way to go.

Of course, that’s not acceptable. And in many cases, say when you have users who no longer use the site but did in the past (and perhaps have billing data associated with them), you never want to get rid of them.

So what to do? Make a special reporting database that gathers information from the production database(s). Use MyISAM tables, because a reporting server can afford to be behind the master, and MyISAM is better for reporting, with more useful table metadata (such as an exact row count). For something like a “Users” table, make two more tables (see the sketch after this list):

1) DeletedUsers
2) AllUsers
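
The excerpt cuts off here, but one way this might be wired up (an assumption, since the full post isn’t shown) is to keep purged rows in DeletedUsers and define AllUsers as a MERGE table over both:

CREATE TABLE DeletedUsers LIKE Users;   -- same structure as Users (MyISAM assumed)
CREATE TABLE AllUsers LIKE Users;
ALTER TABLE AllUsers ENGINE=MERGE UNION=(Users, DeletedUsers);
-- Reporting queries that need everyone, purged or not, can hit AllUsers:
SELECT COUNT(*) FROM AllUsers;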

[Read more]
Working Smarter, Not Harder (SET and ENUM)

So, in a previous post I talked about smart code. Today I put myself square into a discussion about ENUM and SET. ENUM is an enumerated list of values, similar to a pull-down menu: the only values allowed in that field are the ones defined, plus NULL if the column allows it.

The ENUM field takes the idea of normalizing that data and eliminates the need for a join on it. It also makes data integrity easy: if the value you’re trying to enter is not in the ENUM column definition, MySQL throws an error.

ENUM is not a standard SQL data type. It is MySQL specific.
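
A minimal sketch of an ENUM column (a made-up table, not the one from the post):

CREATE TABLE Shirts (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  size ENUM('small','medium','large') NOT NULL
);
INSERT INTO Shirts (size) VALUES ('medium');   -- accepted
INSERT INTO Shirts (size) VALUES ('x-large');  -- rejected in strict SQL mode
                                               -- (a warning and empty value otherwise)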

As an example, in the real world I run a database for an international not-for-profit. …

[Read more]
Smart code

So, the other day I was asked by a developer to come up with a table to tally votes. Basically, there are 6 choices, and customers can vote once per day. I asked if there were any other constraints, and there were none. I specifically asked if they wanted once per calendar date, or ‘it has to be 24 hours since the last vote’; they wanted ‘once per calendar date’. And updating the current day’s vote is not allowed. Once the vote is in, it cannot be changed.

So I came up with a simple table:

CREATE TABLE `ManOfMonth` (
  `uid` int(10) unsigned NOT NULL default '0',
  `voteDate` date NOT NULL default '0000-00-00',
  `uidVoteFor` int(10) unsigned NOT NULL default '0',
  PRIMARY KEY (`uid`,`voteDate`),
  KEY `countVotes` (`uidVoteFor`)
);
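
A quick sketch of how a vote lands in this table (the ids are made up):

INSERT INTO ManOfMonth (uid, voteDate, uidVoteFor)
VALUES (42, NOW(), 7);
-- A second vote from uid 42 on the same date violates the (uid, voteDate)
-- primary key and fails with a duplicate-key error: one vote per day.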

There’s no need for a timestamp; you can insert NOW() into the voteDate column, and MySQL truncates the data for you (we’re using …

[Read more]