One of my colleagues, Ryan Lowe, has just heard that his session on PCI compliance with MySQL has been accepted at the upcoming MySQL conference. Ryan is highly qualified to present this topic, and not many people can say that; I certainly can’t claim that title myself. If you’re looking to learn how to make your MySQL installation PCI-compliant, there’s also not a lot of trustworthy information online. Personally — and really, no bias just because he’s my colleague — I think this is a great session for the MySQL conference, which I sometimes thought didn’t have enough diversity of topics in past years. We need more stuff like this to give people a reason to return after they’ve gone for 2 or 3 years in a …
[Read more]I’ll be helping Morgan Tocker deliver the second half of his training course for MySQL Developers/DBAs in New York City in a few days (more Percona training). It was a snap decision at the last minute, but I’m hoping I’ll still get to meet some folks there. If we’ve corresponded over email or blog comments and you would like to get together, ping me in the comments here!
If you’re in the New York City area and you use MySQL, you should consider attending this course, too. Morgan knows his stuff and has written a good curriculum. Attendees give his courses excellent feedback, and the price is very reasonable. Oh, and I’ll be there too, did I mention that? You can pick my …
[Read more]The headline is flame-bait, don’t take it. I just wanted to point something out about character sets and collations in MySQL.
To the uninitiated, it may seem overwhelming. Everything has a character set! Everything has a collation! And they act weirdly! The server has one. The database has one (oh, and it changes magically as I USE different databases.) Every table has one, and columns too. Is that all? NO! My connection has one! Kill me now!
Relax. In truth, only one kind of thing actually has a charset/collation. That is values. And values are stored in columns. The only thing that really has a charset/collation is a column.[1]
What about all the rest of those things — connection, database, server, table? Those are just defaults, which determine what charset/collation a value gets if it isn’t overridden. So if the table’s default charset is utf8, and you add a column without saying what …
[Read more]Cary Millsap has a concise, readable paper on performance. Anyone involved in database performance optimization should read it. Cary’s writing has heavily influenced the mk-query-digest tool for analyzing MySQL/PostgreSQL/Memcached/HTTP query performance, and I think you’ll get a lot more from mk-query-digest if you read this paper — and you should also read his book, reviewed here. It’s one of the top books on my Essential Books List.
Related posts:
- …
I’m speaking at the O’Reilly MySQL Conference 2010. I hope I don’t lose my voice, because I have four sessions!
- Diagnosing and Fixing MySQL Performance Problems
- EXPLAIN Demystified
- Read-Write Splitting: Techniques, Challenges, and Solutions
- MySQL Graphing and Trending with Cacti
You can click through on the links above to learn …
[Read more]I’ll be attending PgEast this year, as I’ve done for the last couple of years, and this year I’ll also be speaking. The topic is query analysis with mk-query-digest. The official description of my talk is as follows:
mk-query-digest is a powerful open-source tool for capturing, filtering, transforming, and aggregating queries, with the ability to do all sorts of other advanced tasks too. By default, it aggregates similar queries together and presents a designed-for-DBAs report with statistics about the most important queries, so you can see where to focus your optimization efforts. This talk shows you how to use mk-query-digest to analyze your Postgres server’s workload.
…
[Read more]Maatkit does more than just MySQL. I’ve just committed a new version of mk-query-digest, a powerful log analysis tool, with support for Posgtres logs, in both syslog and stderr format. I’m hoping that people will give this a spin in the real world. I have lots of test cases, but that’s never enough; I’m looking for people to crunch their logs and let me know if anything breaks.
A brief tutorial:
# Get it
$ wget http://www.maatkit.org/trunk/mk-query-digest
# Run it
$ perl mk-query-digest --type pglog /path/to/log/file
# Learn about it (search for the string "pglog")
$ perldoc mk-query-digest
I’m going to close comments on this blog post so I don’t get bug reports in the comments. If you have feedback, please post it to the Maatkit mailing list, or …
[Read more]Understanding MySQL Internals
Understanding MySQL Internals. By Sasha Pachev, O’Reilly 2007. Page count: about 227 pages. (Here’s a link to the publisher’s site).
I should have read this book a long time ago, and it’s my loss that I didn’t. Although the title makes it sound like it should only benefit those who’ll be changing the MySQL server’s own code, that’s not true. To the contrary, at least parts of this book should be required reading for DBAs and developers who use MySQL, after they gain a moderate level of familiarity with how to use the server.
The book does indeed start off …
[Read more]I explored two interesting topics today while learning more about Postgres.
Partial page writes
PostgreSQL’s partial page write protection is configured by the following setting, which defaults to “on”:
full_page_writes (boolean)
When this parameter is on, the PostgreSQL server writes the entire content of each disk page to WAL during the first modification of that page after a checkpoint… Storing the full page image guarantees that the page can be correctly restored, but at a price in increasing the amount of data that must be written to WAL. (Because WAL replay always starts from a checkpoint, it is sufficient to do this during the first change of each page after a checkpoint. Therefore, one way to reduce the cost of full-page writes is to increase the checkpoint interval parameters.)
Trying to reduce the cost of full-page writes by increasing the checkpoint interval highlights a compromise. …
[Read more]Many times I’ve heard people advise on “best practices” for a MySQL database. This often includes routine maintenance, such as “you should run OPTIMIZE TABLE on all of your InnoDB tables once a week to defragment them for better performance.”
But this advice is unsubstantiated and could even be detrimental. Here are some of the obvious problems that I can think of:
- The optimized table compacts the primary key (clustered index) to its default 15/16ths fill factor per page. But other indexes will be built in pseudo-random order and are likely to end up just as fragmented afterwards as before. Which indexes are more important for performance? Maybe the primary key is just a dummy value that’s not even used, and the secondary indexes are the ones that would benefit from compacting.
- Suppose the primary key is the important one, and SELECT queries will perform more quickly if it’s defragmented. Why does it …