There have been a lot of systems I have looked at where the
Created_tmp_disk_tables was very high. MySQL has good
documentation on how it uses memory, specifically temp tables,
here.
I would like to talk about not "why" MySQL does this but how to
speed up the performance when MySQL internally or users create
temporary tables to disk. A great solution is TMPFS, a quick how
to is as follows:
-- Before you start
1. Make sure you allocate enough space to TMPFS
-- 2GB is usually safe but if you are using larger data sets with
inefficient queries then there are far worse performance issues
to deal with.
-- The safe way to implement TMPFS for MySQL
shell> mkdir /tmp/mysqltmp
shell> chown mysql:mysql /tmp/mysqltmp
shell> id mysql
##NOTE: make …
I saw Mark Callaghan’s post, and his graph showing miss rate as a function of cache size for InnoDB running MySQL. He plots miss rate against cache size and compares it to two simple models:
- A linear model where the miss rate is (1-C/D)/50, and
- A inverse-proportional model where the miss rate is D/(1000C).
He seemed happy (and maybe surprised) that that the linear model is a bad match and that inverse-proportional model is a good match. The linear model is the one that would make sense if every page were equally likely to have a hit.
I’ll argue here that it’s not so surprising. Suppose that miss rate has a heavy-tailed distribution, such as Zipf’s law. An example of a Zipf’s-law distribution would be if …
[Read more]I saw Mark Callaghan’s post, and his graph showing miss rate as a function of cache size for InnoDB running MySQL. He plots miss rate against cache size and compares it to two simple models:
-
- A linear model where the miss rate is
(1-C/D)/50, and
- A inverse-proportional model where the miss rate is D/(1000C).
He seemed happy (and maybe surprised) that that the linear model is a bad match and that inverse-proportional model is a good match. The linear model is the one that would make sense if every page were equally likely to have a hit.
I’ll argue here that it’s not so surprising. Suppose that miss rate has a heavy-tailed distribution, such as Zipf’s law. An …
[Read more]There’s a really important difference between a unique index (MySQL’s answer to a “unique constraint”) and a primary key in MySQL. Please take a look at this:
CREATE TABLE `t` (
`a` int,
`b` int,
`c` int,
UNIQUE KEY `a` (`a`,`b`)
)
The combination of columns a, b should uniquely
identify any tuple in the table, right?
select * from t;
+------+------+------+
| a | b | c |
+------+------+------+
| 1 | 2 | 3 |
| NULL | NULL | 1 |
| NULL | NULL | 1 |
| NULL | NULL | 1 |
+------+------+------+
Wrong. Our arch-enemy NULL messes things up again:
A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. This constraint does …
[Read more]|
|
On Friday afternoon, I went to give a presentation about MySQL advanced features at the Sardegna Ricerche technology park. The presentation included a quick introduction to MySQL Sandbox, something that I have been doing for years, and I thought I could do blindfold, if required. However, something didn't go as expected. |
Just when I was showing off how easy is it to install a MySQL sandbox from a tarball, I was faced by an unexpected error. The tar application was not among the recognized ones. As soon as I saw the error, I immediately knew what had …
[Read more]I have pushed more performance monitoring changes to the Facebook patch on Launchpad. This includes:
- Support for SHOW INNODB FILE STATUS that reports IO statistics per InnoDB tablespace. When innodb_file_per_table is used, this provides statistics per table.
- Counters in SHOW STATUS that report time processing SQL statements including Command_seconds for non-replication SQL, Command_slave_seconds for replication SQL and Innodb_record_in_range_seconds that measures time in InnoDB during query optimization.
- Changes to ignore enforcment of innodb_thread_concurrency for the replication SQL thread. This should reduce delay but more remains to be done. When a slave is busy with many more concurrent queries than CPU cores, …
In a previous post, I outlined a plan to port the
Build-It-Blocks web site from MS SQL to
MySQL. I plan to run the Windows stack (Windows, .asp, and
IIS) on MySQL via the MyODBC connector. Being a Red Sox
fan, this feels like getting David Ortiz to hit with A-Rod's bat,
but I digress. So far, things are going smoothly:
1. Install MySQL and MyODBC
I chose to install MySQL from the zip file, primarily because I
wanted the ability to leave the Windows Server in a known good
state if something went wrong; by doing a manual install I knew I
would be able to "un-install" everything if necessary.
Also, all of my experience with MySQL has been on Linux, so I'm
comfortable with the …
I think I have mentioned Picok before on my blog. This is a system that lets users arrange and configure a number of portlets in order to be able to keep themselves up to date with what's going on in various web applications, similar to iGoogle and netvibes. The difference is of course that this system is entirely open source, so companies can install Picok in their intranet and give Picok direct access to all sorts of internal applications that they could not make available to iGoogle or netvibes. What makes this project all the more exciting is that this system was initially developed as a closed source application for the Raiffeisen bank in Switzerland. This was the first time for me, where I had the opportunity to be part of open sourcing such a large chunk of code.
One of the things we open source proponents mention as a plus point for open source is that developers do not easily get away …
[Read more]I often see people confuse different ways MySQL can use indexing, getting wrong ideas on what query performance they should expect. There are 3 main ways how MySQL can use the indexes for query execution, which are not mutually exclusive, in fact some queries will use indexes for all 3 purposes listed here.
Using index to find rows The main purpose of the index is to find rows quickly - without scanning whole data set. This is most typical reason index gets added on the first place. Most popular index type in MySQL - BTREE can speed up equality and prefix range matches. So if you have index on (A,B) This index can be used to lookup rows for WHERE clauses like A=5 ; A BETWEEN 5 AND 10 ; A=5 AND B BETWEEN 5 AND 10 it however will NOT be able to help lookup rows for B BETWEEN 5 AND 10 predicate because it is not index prefix. …
[Read more]
In my recent post on the EU antitrust regulators'
probe into the Oracle Sun merger I did not mention an important
class of stakeholders: the MySQL-based special purpose database
startups. By these I mean:
I think it's safe to say the first three are comparable in the
sense that they are all analytical databases: they are designed
for data warehousing and business intelligence applications.
ScaleDB might be a good fit for those …