We've begun writing the second edition of the now-classic High Performance MySQL. "We" means co-authors Arjen Lentz, Baron Schwartz, Vadim Tkachenko, and Peter Zaitsev. O'Reilly is still the publisher, and Andy Oram is still the editor. With a team like this, I think the second edition will be a book you don't want to miss. Though in theory we're revising the first edition, the truth is we're starting from scratch, rewriting the book and significantly expanding it at the same time. A lot has changed since Jeremy and Derek wrote the first edition. Today's MySQL deployments push the limits further than many people thought possible a few years ago. We'll teach you how they do it.
At the MySQL Conference & Expo 2007, I had the chance to meet up with Paul (the author of PBXT) and Mikael. We briefly touched on the BLOB Streaming Protocol that Paul is working on, which I find really neat. On the way back home, I traveled with Anders Karlsson (one of MySQL's Sales Engineers), who is responsible for the BLOB Locator worklog, and he described the concepts from his viewpoint.
Since I work with replication, these things got me thinking about what the impact is for replication and how it affects usability, efficiency, and scale-out. Being a RESTful guy, I started thinking about URIs both when …
In the first two articles in this series, I discussed archiving basics, relationships and dependencies, and specific archiving techniques for online transaction processing (OLTP) database servers. This article covers how to move the data from the OLTP source to the archive destination, what the archive destination might look like, and how to un-archive data. If you can un-archive easily and reliably, a whole new world of possibilities opens up.
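As a rough illustration of moving rows to an archive and back, here is a minimal sketch. It is not taken from the articles: the `orders` and `orders_archive` tables, the `created` column, the `move_rows` helper, and the use of SQLite in place of a real OLTP server are all assumptions made so the example runs on its own. It copies and deletes each small chunk in one transaction, and un-archives by running the same move in the opposite direction.

```python
import sqlite3


def move_rows(conn, src, dst, cutoff, chunk=2):
    """Move rows with created < cutoff from src to dst in small chunks.

    src and dst are trusted table names (hypothetical, for illustration).
    """
    moved = 0
    while True:
        ids = [row[0] for row in conn.execute(
            f"SELECT id FROM {src} WHERE created < ? ORDER BY id LIMIT ?",
            (cutoff, chunk))]
        if not ids:
            return moved
        marks = ",".join("?" * len(ids))
        with conn:  # the copy and the delete commit together, or not at all
            conn.execute(
                f"INSERT INTO {dst} SELECT * FROM {src} WHERE id IN ({marks})",
                ids)
            conn.execute(f"DELETE FROM {src} WHERE id IN ({marks})", ids)
        moved += len(ids)


# Hypothetical schema and data, both tables in one in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created TEXT)")
conn.execute("CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, created TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2005-01-10"), (2, "2005-03-02"), (3, "2006-07-15"),
     (4, "2004-11-30"), (5, "2007-02-01")])
conn.commit()

# Archive everything created before 2006, then check what is left.
archived = move_rows(conn, "orders", "orders_archive", "2006-01-01")
remaining_after_archive = conn.execute(
    "SELECT COUNT(*) FROM orders").fetchone()[0]

# Un-archive: the same move with source and destination swapped.
restored = move_rows(conn, "orders_archive", "orders", "2006-01-01")
```

Keeping the chunks small limits how long each transaction holds locks; the chunk size of 2 here is only for demonstration.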
In the first article in this series on archiving strategies for online transaction processing (OLTP) database servers, I covered some basics: why to archive, and what to consider when gathering requirements for the archived data itself. This article is more technical. I want to help you understand how to choose which rows are archivable, and how to deal with complex data relationships and dependencies. In that context, I'll also discuss a few concrete archiving strategies, their strengths and shortcomings, and how they can satisfy your requirements, especially requirements for data consistency, which as you will see is one of the most difficult problems in archiving.
In May 2005, I wrote a widely-referenced article about how to efficiently archive and/or purge data from online transaction processing (OLTP) database servers. That article focused on how to write efficient archiving SQL. In this article I'll discuss archiving strategy, not tactics. OLTP servers tend to have complex schemas, which makes it important and sometimes difficult to design a good archiving strategy.
From the keynote presentation this past Tuesday:
I also have an audio recording of the talk, but it is in poor shape. If I can clean it up, I will post it too.
In the previous article I discussed using Read Replication
Clustering to scale out reads for a website. What I will now do
is describe a refined approach to the problem of scaling by
creating "Application Clusters with Replication".
A common approach to website design is that a web designer creates a website and decides that search is a feature that they want to implement. If they use the MyISAM engine this means that they can add fulltext indexes to their tables and then make use of them in queries. I will ignore the case where the developer decides that an unanchored LIKE clause is an appropriate solution, since this developer will quickly hit a wall on performance and will need to learn what a fulltext index is.
So the developer adds a fulltext index and is good to go? Sounds like an easy solution?
If the site the developer has written begins to see significant traffic then one of three things will occur. …
If there is a common method of scaling with MySQL databases it is
the Read Replication Cluster solution.
Most websites start out with a single database and grow from there.
If the site's content is being generated from their database then
they will eventually hit a wall with reads from the database. Tuning
and hardware will buy you some growth but in the end disks spin only
so quickly. Luckily most websites are predominantly read intensive
and for this reason replication will solve scaling problems for many
people. Replication is a means by which MySQL sends updates of one
database to one or more databases which act as slaves. These
changes are atomic, which means the changes are applied in full. No
row will ever be partially updated, and no transaction will be seen
on the slave that did not commit on the master.
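
On the application side, scaling reads this way means sending writes to the master and spreading reads over the slaves. The sketch below shows one way that routing might look. The `ReadWriteRouter` class, the server names, and the rule that only `SELECT` and `SHOW` count as reads are assumptions for illustration, not part of MySQL or of the original article.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the master and spread reads across slaves."""

    # Statement prefixes treated as reads (an assumption for this sketch).
    READ_PREFIXES = ("select", "show")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._readers = itertools.cycle(slaves or [master])

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        if sql.lstrip().lower().startswith(self.READ_PREFIXES):
            return next(self._readers)  # round-robin over the slaves
        return self.master


# Hypothetical server names; a real application would hold connections.
router = ReadWriteRouter("master", ["slave1", "slave2"])
targets = [
    router.route("SELECT * FROM posts"),
    router.route("UPDATE posts SET title = 'x' WHERE id = 1"),
    router.route("SELECT * FROM comments"),
    router.route("SELECT 1"),
]
```

Because slaves apply changes after the master commits them, a read that must see a write made a moment ago may need to be sent to the master instead.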
Make a change in the …
I've never been all that interested in solving small problems.
Small problems with scaling are resolved with single indexes,
upgrades to hardware, or simply creating a bigger pipe.
When the measure of the Internet was a T-1, you could flood the network with the average 486. At the time I watched people buy hardware costing hundreds of thousands, and sometimes more, which then went unused. Today's hardware is overkill for a lot of applications, so the first step in scaling is often tuning the hardware that you have already purchased. Make use of what you already have.
The "Slashdot Effect" is a perfect example of what is normally a small problem. What is the Slashdot Effect? Point tens of thousands of eyeballs at a website and watch it crash. The root cause of this? Most of the time it is because the site operator had their Apache max connections set to some ridiculous number. Users would bring the site down because there …