Showing entries 1 to 7
Displaying posts with tag: Scaleout and Tuning
How large can a MySQL database become?

In "Maximum MySQL Database Size?", Nick Duncan wants to find out what the maximum size of his MySQL database can possibly be. He answers that with a list of maximum file sizes per file system type. That is not a useful answer.

While every file system does have a maximum file size, this limitation is usually not relevant when it comes to MySQL maximum database size. But let's start with file systems, anyway.

First: You never want to run a database system on a FAT filesystem, ever. In FAT, a file is a linked list of blocks in the FAT. That is, certain seek operations (backward seeks in particular) become slower the larger a file is, because the file system has to position the file pointer by traversing the linked list of blocks in the FAT. Since seek operations are basically what a large database does all day, FAT is …
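The cost of such a seek can be sketched in a few lines. This is a toy model, not real FAT driver code: the table is a plain Python list, and the names `cluster_for_offset` and `END_OF_CHAIN` are made up for illustration. It only shows that the number of links followed grows with the target offset.

```python
# Toy model of a FAT chain: fat[cluster] holds the number of the next
# cluster of the same file, or END_OF_CHAIN for the last one.
# All names here are hypothetical, chosen for illustration only.
END_OF_CHAIN = -1


def cluster_for_offset(fat, first_cluster, offset, cluster_size):
    """Return (cluster, hops) for the cluster that contains byte `offset`.

    The chain has to be walked link by link from the first cluster,
    so the work is proportional to offset // cluster_size.
    """
    hops = offset // cluster_size
    cluster = first_cluster
    for _ in range(hops):
        cluster = fat[cluster]
        if cluster == END_OF_CHAIN:
            raise ValueError("offset is past the end of the file")
    return cluster, hops
```

In this model, positioning at the start of a file costs zero hops, while positioning near the end of a file with n clusters costs about n hops.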

DELETE, innodb_max_purge_lag and a case for PARTITIONS

Where I work, Merlin is an important tool for us and provides a lot of insight that other, more generic monitoring tools do not provide. We love it, and in fact love it so much that we have about 140 database agents reporting into Merlin 2.0 from about 120 different machines. That results in a data influx of about 1.2G a day without using QUAN, and in a data influx of about 6G a day using QUAN on a set of selected machines.

This influx completely overwhelms the Merlin data purge process, so the Merlin database grows out of bounds, which is quite unfortunate because our disk space is in fact very bounded.

The immediate answer to our purge problem was to disable the Merlin internal purge and, with the kind help of MySQL support, to create a script which generates a list of record ids to delete. These ids end up in a number of delete statements with very large WHERE ... IN (...) clauses that do the actual delete.
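The general shape of such a script might look like the following sketch. It is not the script written with MySQL support; the function names, the `batch_size` default and the table name in the usage note are all assumptions for illustration. It only shows how a list of ids can be split into several DELETE statements with large IN lists.

```python
from typing import Iterable, Iterator, List, Tuple


def _delete_statement(table: str, key: str, batch: List[int]) -> str:
    # One %s placeholder per id, so the driver does the quoting.
    placeholders = ", ".join(["%s"] * len(batch))
    return f"DELETE FROM {table} WHERE {key} IN ({placeholders})"


def batched_deletes(
    table: str, key: str, ids: Iterable[int], batch_size: int = 1000
) -> Iterator[Tuple[str, List[int]]]:
    """Yield (sql, params) pairs, one DELETE statement per batch of ids.

    `table` and `key` are interpolated directly and must come from
    trusted code, never from user input.
    """
    batch: List[int] = []
    for record_id in ids:
        batch.append(record_id)
        if len(batch) == batch_size:
            yield _delete_statement(table, key, batch), batch
            batch = []
    if batch:
        # Final, possibly shorter batch.
        yield _delete_statement(table, key, batch), batch
```

Each pair can be passed to a DB-API cursor as `cursor.execute(sql, params)`, for example with a hypothetical table `demo_samples` and key column `id`. Committing after every batch keeps the individual transactions small.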

This …

Seven times faster commit speed in Windows?

According to my findings in Bug #31876, MySQL does not commit data to disk in Windows using the same method MS SQL Server and DB/2 are using. The method MySQL uses appears to be seven times slower in pathological scenarios.

The bug report contains a patch - thanks to the MySQL WTF (The Windows Task Force) and the lab provided by the customer for helping me to find that.

Does this work for you? I want to hear about your test results.

Rubyisms

Lately, I have had the opportunity to evaluate a very large Ruby installation that was also growing very quickly. A lot of the work performed on site has been specific to the site, but other observations are true for the platform no matter what is being done on it. This article is about Ruby on Rails and its interaction with MySQL in general.


Changing everything

This article does not even contain the words database or MySQL. I still believe it is somewhat interesting.

Mail has, for some reason, always played a big role in my life. I ran mail for two, my girlfriend and me, in 1988. I ran mail for 20 and 200 people in 1992, setting up a citizens' network. Later I designed and built mail systems for 2 000 and 20 000 person corporations, and planned mail server clusters for 200 000 and 2 million users. And just before I became a consultant at MySQL I was working for a shop that did mail for a living for 20 million users.

Mail is a very simple and well defined collection of services. You accept incoming messages to local users, you implement relaying for your local users with POP-before-SMTP and SMTP AUTH, you build POP, IMAP and webmail access, and you deploy spam filter systems and virus scanners for incoming and outgoing messages. These services …

Statification

In "Semi-Dynamic Data", Sheeri writes about semi-dynamic data and content pregeneration. In her article, she suggests that for rarely changing data it is often advisable to precompute the result pages and store them as static content. Sheeri is right: Nothing beats static content, neither for speed nor for reliability. But pregenerated pages can be a waste of system resources when the number of possible pages is very large, or if most of the pregenerated pages are never hit.

An intermediate scenario may be a statification system and some clever caching logic.

Statification is the process of putting your content generation code into a 404 page handler and having that handler generate the requested content. The idea is that on a …
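A minimal sketch of such a handler follows. It assumes, beyond what the excerpt states, that the handler stores the generated page under the document root so that the web server can serve later requests for the same URL as a plain file. The names `handle_missing_page` and `render` are hypothetical, and the wiring into an actual web server's 404 mechanism is left out.

```python
from pathlib import Path
from typing import Callable, Optional


def handle_missing_page(
    docroot: Path, url_path: str, render: Callable[[str], Optional[str]]
) -> Optional[str]:
    """Generate the page for `url_path` and store it below `docroot`.

    Returns the page body, or None if the path is not acceptable or
    `render` cannot produce content (a real 404).
    """
    root = docroot.resolve()
    target = (docroot / url_path.lstrip("/")).resolve()
    if root not in target.parents:
        # Refuse paths that escape the document root, such as "../x".
        return None
    body = render(url_path)
    if body is None:
        return None
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name, then rename, so that a concurrent
    # request never sees a half-written file.
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(body, encoding="utf-8")
    tmp.replace(target)
    return body
```

In this sketch, invalidation is just a file delete: removing the stored file makes the next request fall through to the handler again.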

Dealing with failure - the key to scaleout

This is a translation of a German-language article I wrote two weeks ago for my German-language blog.

In 2004, when I was still working for web.de, I gave a little talk on scaleout at Linuxtag. Even back then, the major messages of the talk were "Every read problem is a cache problem" and "Every write problem is a problem of distribution and batching":

To scale, you have to partition your application into smaller subsystems and …
