As I indicated in my previous post on MySQL performance, we have
been doing some performance work using an internally developed
Web 2.0 application. Akara and I will be presenting this app
publicly to a large audience for the first time at the upcoming
Velocity Conference in Burlingame, CA on June
23 and 24. Check out our abstract. Most of our work uses Cool
Stack, so a lot of the results we will be presenting will be based
on that. If you're struggling with performance issues, this
conference may be worth checking out.
If you will be attending the conference, please stop by and say
hello. It's always good to see people whom we only know through
blogs and forums.
First, let's look at the numbers. The table below lists the speed of building a timeline the way Twitter does, with every approach using the pull model.
| | timelines / sec. |
|---|---|
| SQL | 56.7 |
| Stored Procedure | 136 |
| UDF using Direct Access | 1,710 |
As I explained in my previous post (Implementing Timeline in Web Services - Paradigms and Techniques), it is difficult (if not impossible) to write an optimal SQL query to build timelines on the fly. Yesterday I asked on the MySQL Internals mailing …
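For readers unfamiliar with the pull model mentioned above, here is a minimal sketch in Python. It is not the SQL, stored procedure, or UDF code that was benchmarked; the function name, the in-memory dictionaries, and the `(timestamp, author, text)` tuple layout are all assumptions made for illustration. The idea is only that the timeline is assembled at read time by merging the posts of everyone a user follows.

```python
import heapq
from itertools import islice


def build_timeline(user, following, posts_by_author, limit=20):
    """Pull model: merge the followed authors' posts when the timeline is read.

    Hypothetical data layout (not from the original post):
      following        maps a user to the list of authors they follow
      posts_by_author  maps an author to posts as (timestamp, author, text),
                       already sorted newest first
    """
    streams = [posts_by_author.get(author, []) for author in following.get(user, [])]
    # heapq.merge lazily merges sorted inputs; reverse=True expects
    # each input to be sorted from largest to smallest timestamp.
    merged = heapq.merge(*streams, key=lambda post: post[0], reverse=True)
    return list(islice(merged, limit))


if __name__ == "__main__":
    following = {"alice": ["bob", "carol"]}
    posts = {
        "bob": [(5, "bob", "second post"), (1, "bob", "first post")],
        "carol": [(3, "carol", "hello")],
    }
    for post in build_timeline("alice", following, posts, limit=2):
        print(post)
```

Because nothing is precomputed, every read pays the cost of the merge, which is why the speed of that read path is the number worth measuring.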
MySQL Quickpolls might be insightful for people who develop products and services for MySQL. Recently I was looking again at the “How do you backup your production database” poll. To interpret the results, I wanted to know who is answering that and other Quickpolls. Are they the DBAs responsible for running MySQL in production, or the developers writing applications that use MySQL? For a backup guy like me, knowing that makes a difference.
Every Quickpoll gets a time stamp when opened and tells how many people answered the poll. It occurred to me that the normalized number of people (MySQL polls run for different periods of time) answering each poll could give me some insight. The graph below shows the daily number of people answering each poll in the last 24 months.
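The normalization described above is just the response count divided by the number of days the poll was open. A minimal sketch in Python, where the function name and the sample figures are hypothetical and not taken from the actual Quickpoll data:

```python
from datetime import date


def daily_response_rate(responses, opened, closed):
    """Average number of answers per day over the time a poll was open."""
    days_open = (closed - opened).days
    if days_open <= 0:
        raise ValueError("poll must be open for at least one day")
    return responses / days_open


if __name__ == "__main__":
    # Hypothetical poll: 300 answers collected over 30 days.
    print(daily_response_rate(300, date(2008, 5, 1), date(2008, 5, 31)))
```

Dividing by the open period makes a poll that ran for a week comparable with one that ran for two months.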
Of course, I understand there could be self-selection …
I spent this past weekend writing a paper for a project I’ve been playing with. It is a simplified distributed processing system loosely based on Google’s MapReduce, except that rather than focusing on larger batch jobs, it prototypes some common database application uses. The model is currently very basic, but I plan on exploring this further (possibly with a performance-enhanced implementation in C). I’ve also been reading up on other interesting projects like Hadoop, HyperTable, Amazon’s SimpleDB, and of course the DB interface for Google’s AppEngine. I’m wondering how these …
First, some of the things that you need to use and understand:
- EXPLAIN Syntax
- ORDER BY Optimization
- GROUP BY Optimization
Update: Updated errors.
Now some details that are usually missed. GROUP BY sorts its
results unless you tell MySQL not to (for example, with ORDER BY
NULL). GROUP BY has two optimization methods: loose index scan
and tight index scan.
A loose index scan scans the entire table index, while a tight
index scan uses some sort of constraint. For large datasets that
are accessed often and require some sort of GROUP BY, tight index
scans are better.
So how to pick columns to create …
As Paul points out, this new erlrc project is very exciting news. One of the most interesting features of Erlang is how you can do hot code updating, and getting integrated into the package manager is absolutely wonderful. Anyone working on getting this into Ubuntu yet? There is a very nice howto written about how to set up your Erlang app with this. I’m looking forward to setting this up on my mini-cluster of slicehost nodes.
Simple auditing, i.e., knowing what changed recently, can save you tons of time while troubleshooting.
I know that, in the ideal world:
- Everything is supposed to be done through configuration management.
- Everything is documented and all changes are tracked through a VCS.
- Every DDL or SET GLOBAL is trapped via MySQL Proxy and logged.
But there are always ways to bypass the gatekeepers. Changes can go in unnoticed. An hour or so later, your database performance suddenly changes for the worse, and you get that phone call.
First you check if anything caused an actual error. You look around at a few log files and nothing shows up. The next thing you ask yourself is: did someone change anything in the last little while? Of course, everybody says no. After a few hours of digging, comparing schemas, and diff-ing old and current config files, you actually find what has changed, …
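As a rough illustration of the config-comparison step described above, here is a minimal Python sketch that reports which settings differ between two snapshots. It assumes each snapshot has already been loaded into a dictionary of name/value pairs (for instance from a saved copy of the server variables); the function name and the sample values are hypothetical.

```python
def changed_settings(old, new):
    """Return {name: (old_value, new_value)} for every setting that differs.

    A setting present in only one snapshot is reported with None
    standing in for the missing side.
    """
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    # Hypothetical snapshots taken before and after the slowdown.
    yesterday = {"innodb_buffer_pool_size": "8G", "query_cache_size": "64M"}
    today = {"innodb_buffer_pool_size": "8G", "query_cache_size": "0", "max_connections": "500"}
    for name, (before, after) in changed_settings(yesterday, today).items():
        print(f"{name}: {before} -> {after}")
```

Keeping a dated snapshot around makes the question of what changed a lookup rather than a few hours of digging.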
On Thursday, June 12, at 15:00 CEST (14:00 GMT), there will be a MySQL University Lesson on MySQL Sandbox, a tool to install one or more side servers in a few seconds.
To attend the lesson, follow the instructions for attendees and download the recommended material.