If you have two or more database servers containing the same
data, how do you know if the objects are identical. Furthermore,
how can you be sure the data is the same on all of the servers?
What is needed is a way to determine if the databases are in
synch - all objects are present, the object definitions are the
same, and the tables contain the same data. Synchronizing data
can become a nightmare without the proper tools to quickly
identify differences among objects and data in two databases.
Perhaps a worst case (and more daunting) is trying find data that
you suspect may be different but you don’t have any way of
finding out.
This is where the new 'mysqldbcompare' utility comes in handy.
The mysqldbcompare utility uses the mysqldiff functionality
(mysqldiff allows you to find the differences in object
definitions for two objects or a list of objects in two
databases) and permits you to compare the object definitions and
the data …
I’ve just sent an email to the Maatkit discussion list to announce a planned change to how Maatkit (and Aspersa) are developed. In short, Percona plans to create a Percona Toolkit of MySQL-related utilities, as a fork of Maatkit and Aspersa. I’m very happy about this change, and I welcome your responses to that thread on the discussion list.
Related posts:
[Read more]I’ll be speaking at Surge again this year. This time, unlike last year’s talk, I’m tackling a very concrete topic: extracting scalability and performance metrics from TCP network traffic. It turns out that most things that communicate over TCP can be analyzed very elegantly just by capturing arrival and departure timestamps of packets, nothing more. I’ll show examples where different views on the same data pull out completely different insights about the application, even though we have no information about the application itself (okay, I actually know that it’s a MySQL database, and a lot about the actual database and workload, but I don’t need that in order to do what I’ll show you). It’s an amazingly powerful technique that I continue to find new ways to apply to real systems.
Take a …
[Read more]It’s notoriously hard to measure the usage of open-source software. Software that’s open-source or free can be redistributed far and wide, so the original creators have no idea how many times it’s installed, deployed, or distributed. As a proxy, we often use downloads, but that’s woefully inadequate.
I’ve recently begun trying to figure out how many job openings are mentioning various open-source projects. I think that this might be a better metric because it’s driven by the end result (usage), rather than intermediate processes (downloads, etc). I think that it’s likely that usage and demand for skilled people is somewhat realistically related.
To be more concrete, I’ve been watching RSS feeds from job posting aggregators for several alternative versions of MySQL: Percona Server, MariaDB, and Drizzle. It appears that Percona Server is by far the most in-demand in terms of job skills. (I haven’t seen a job posting …
[Read more]SQL Antipatterns, by Bill Karwin, Pragmatic Bookshelf, 2010. About 300 pages. Here’s a link to the publisher’s site.
I loved this book. (Disclosure: Bill is a colleague of mine.) This is the first book I’ve read from the Pragmatic Bookshelf, and if the rest are like this one, I want to read them. The quality of the writing is way above the average technical book. Techniques that feel gimmicky and forced in other books, such as fake stories to introduce each chapter, actually work here (because they’re real stories, not fake ones). Each topic is named in a memorable way and introduced very …
[Read more]I was at a conference recently talking with a Major Cloud Hosting Provider and mentioned that for database servers, I really want large instances, quite a bit larger than the largest I can get now. The lack of cloud servers with lots of memory, many fast cores, and fast I/O and network performance leads to premature sharding, which is costly. A large number of applications can currently run on a single real server, but would require sharding to run in any of the popular cloud providers’ environments. And many of those applications aren’t growing rapidly, so by the time they outgrow today’s hardware we can pretty much count on simply upgrading and staying on a single machine.
The person I was talking to actually seemed to become angry at me, and basically called me an idiot. This person’s opinion is that no one should be running on anything larger than 4GB of memory, and anyone who doesn’t build their system to be sharded …
[Read more]MySQL Replication by Russell Dyer, Silent Killdeer, 2010. About 180 pages.
This is a pocket-sized guide to setting up and managing MySQL replication. It is self-published and made via print-on-demand technology. Topics include how replication works, setting up replication, making backups, and administering replication after it’s working. There are several appendixes for replication-related functionality in the MySQL server and command-line tools.
This book doesn’t go into great depth, so don’t expect it to be a reference manual to replication internals or anything like that. It’s more of a how-to manual for beginners, walking through the basics of binary and relay logs, SQL and I/O …
[Read more]The most basic and most oft-repeated task that a DBA has to accomplish is to look at slow logs and filter out queries that are suboptimal, that consume lots of unnecessary resources and that hence slow down the database server. This post looks at why and how VIEWs can help against such suboptimal operations.
To follow on to my earlier links to Brendan Gregg’s blog posts on measuring I/O latency, there is a third one discussing DTrace, and then a very detailed response from Mark Leith showing how to do it with the PERFORMANCE_SCHEMA in MySQL 5.5.
Related posts:
- Disk latency versus filesystem latency
- Measuring the popularity of the Percona MySQL build
- Thoughts on the new …
This will just be a short reminder. If you're planning to create multiple-column indexes in MySQL, remember that the order of the columns matters. As the MySQL documentation points out, any "leftmost prefix of the index can be used." This means that the index can be used for any number of columns in a where clause as long as those columns are at the beginning of the index column list and in the same order.