Mysqlslap is a good benchmarking tool, but in my opinion it could be much more versatile. Recently, I was benchmarking a problem that was not hard to solve, but somewhat tricky to benchmark. Using mysqlslap was the fastest way to get the answers I wanted with the least amount of BS, or so I thought!
Let's take a look:
-- The problem
1. A MySQL instance with 5 databases
2. Only one of the databases looked to be the problem, based on system analysis
3. Light concurrency issues were suspected, given that the tables were MyISAM and given the information gathered from the customer
-- How to run a close-to-real-world benchmark
Turning on the general query log is a must in this situation, so I gathered one hour's worth of transactions. Please keep in mind that
you have to take a snapshot of the data before you …
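The post is truncated here, but to sketch the general approach: capture an hour of traffic with the general query log, then replay it with mysqlslap. The paths and option values below are illustrative assumptions, not the original commands (this assumes MySQL 5.1+, where the general log can be toggled at runtime):

shell> mysql -e "SET GLOBAL general_log_file='/tmp/general.log'; SET GLOBAL general_log='ON';"
## ...capture an hour of production traffic, then:
shell> mysql -e "SET GLOBAL general_log='OFF';"
## extract the query text from the log into a replay file, then run it:
shell> mysqlslap --create-schema=mydb --query=/tmp/replay.sql --concurrency=25 --iterations=5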
After Chris' great post, What's faster than ALTER, I got thinking about
REPAIR. REPAIR TABLE can be slow, but may seem necessary if your
table is corrupt.
A little background:
MyISAM tables are prone to corruption if they are receiving updates/inserts/deletes/etc. and your server crashes. Doing a
REPAIR TABLE can generally "fix" them, but it can take time. The
larger the table, the longer it takes. How long? Well, a
rough approximation that I use is:
Size of table | Approximate time
KBs           | N seconds
MBs           | N minutes
GBs           | N hours
So a …
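For reference, a minimal check-and-repair sequence looks something like this (the table name is illustrative):

mysql> CHECK TABLE t1;
mysql> REPAIR TABLE t1;
mysql> REPAIR TABLE t1 QUICK; -- repairs only the index file, often much faster

QUICK skips the data file, so it only helps when the corruption is limited to the indexes.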
So I was watching the Discovery Channel, on the maintenance of the Golden Gate Bridge. I was surprised to find out that it takes two years to paint the whole thing; then, when they are "finished", the job starts all over again. Basically, this is a never-ending project! I started to think about the larger, several-thousand-server MySQL installations I've worked on and how to improve performance. Oddly enough, The Project Formerly Known as Golden Gate was born! Thought I would change the name given the recent threat from Big Brother!
-- The forethought
Take an environment with several hundred or, in this case, several thousand servers and figure out a way to increase performance by 30%-80% or more.
-- Back to basics
Defragmentation of the data set is a simple way to make sure that your throughput is at its best. Over time, social networking
sites, or any site that runs heavy writes with …
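The post is truncated here, but the basic defragmentation step it is building toward is presumably along the lines of OPTIMIZE TABLE; the table name is illustrative:

mysql> SHOW TABLE STATUS LIKE 'messages'\G -- Data_free shows reclaimable space
mysql> OPTIMIZE TABLE messages;            -- rebuilds and defragments a MyISAM table

Keep in mind that OPTIMIZE TABLE locks a MyISAM table while it runs, so schedule it accordingly.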
There have been a lot of systems I have looked at where Created_tmp_disk_tables was very high. MySQL has good documentation on how it uses memory, specifically temp tables, here.
I would like to talk about not "why" MySQL does this, but how to speed up performance when MySQL (internally) or users create temporary tables on disk. A great solution is TMPFS; a quick how-to follows:
-- Before you start
1. Make sure you allocate enough space to TMPFS
-- 2GB is usually safe, but if you are using larger data sets with inefficient queries, then there are far worse performance issues to deal with.
-- The safe way to implement TMPFS for MySQL
shell> mkdir /tmp/mysqltmp
shell> chown mysql:mysql /tmp/mysqltmp
shell> id mysql
##NOTE: make …
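The steps are cut off above; a typical completion (the size, the uid/gid values, and the my.cnf change are my assumptions, not the original text) would be:

shell> mount -t tmpfs -o size=2G,uid=27,gid=27 tmpfs /tmp/mysqltmp
## use the uid/gid numbers reported by `id mysql` above
## then point MySQL at it in my.cnf and restart:
## [mysqld]
## tmpdir = /tmp/mysqltmp

Adding a matching /etc/fstab entry keeps the mount across reboots.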
Well, the answer I always get is "DROP". Then I would say... "What if you wanted to preserve your data?" It is very common for DBAs to think only in terms of what is obviously available in MySQL (ALTER) rather than all of the tools that are really there. I have been placed in situations at many companies where ALTER was just not the way to go.
Given the amount of MySQL usage throughout the WEB2.0/IT industry, and the fact that data sets seem to keep growing... I have come up with a great solution for altering large data sets. Please keep in mind that this is an old feature of MySQL and was a great "lightning speech" at the Percona conference in 04/08.
-- THE TABLE
mysql> use test;
mysql> show table status like 't1'\G
*************************** 1. row ***************************
Name: t1
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: …
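The excerpt ends before the technique itself. One common data-preserving alternative to a long in-place ALTER (not necessarily the method this post goes on to describe) is a shadow-table swap; the added column is purely illustrative:

mysql> CREATE TABLE t1_new LIKE t1;
mysql> ALTER TABLE t1_new ADD COLUMN notes TEXT; -- instant: the table is empty
mysql> INSERT INTO t1_new SELECT * FROM t1;
mysql> RENAME TABLE t1 TO t1_old, t1_new TO t1;  -- atomic swap

Note that writes landing on t1 during the copy are lost unless you lock the table or queue the changes first.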
There is thorough documentation about the Partitioning feature in
MySQL 5.1. There are also nice articles like this one by Robin. However, I thought it would
be useful to have a quick "how-to" guide to partitioning by
dates. I'll use the world schema to keep it easy.
Partitioning a table by a range of dates is quite popular.
Unfortunately, PARTITION BY RANGE only accepts an integer (or a function that evaluates to an integer) as the partitioning expression. That's fine if you want to partition by numeric types
like this:
ALTER TABLE City PARTITION BY RANGE(id) (
PARTITION p0 VALUES LESS THAN (1000),
PARTITION p1 VALUES LESS THAN (2000),
PARTITION p2 VALUES LESS THAN (3000),
PARTITION p3 …
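For dates, the standard trick is to wrap the column in an integer-valued function such as TO_DAYS(); the table and column names here are illustrative, not from the world schema:

ALTER TABLE orders PARTITION BY RANGE (TO_DAYS(order_date)) (
PARTITION p2007 VALUES LESS THAN (TO_DAYS('2008-01-01')),
PARTITION p2008 VALUES LESS THAN (TO_DAYS('2009-01-01')),
PARTITION pmax VALUES LESS THAN MAXVALUE
);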
Testing Secure Replication
For the first test I used encrypted (SSL) replication and inserted 200,000 records, using three 10-minute disconnection intervals per hour.
After several hundred thousand inserts, deletes, and updates run simultaneously against the SOA and RR tables over a normally connected SSL replication channel, I have the following results:
RR Update Test
Test 1:
RR Records Updated (Changes Set) = 129145
Elapsed Time = 14147.411601 seconds
RR Updates/Second = 9.13
RR Updates/Minute = 547.8
Europe MySQL Slave RR Update Counts (Changes Set) = 129145
Asia MySQL Slave RR Update Counts (Changes Set) = 129145
Test 2:
RR Records Updated (Changes Cleared) = 129145
Elapsed Time = 10168.638867 seconds
RR Updates/Second = 12.70
RR Updates/Minute = 762
Europe MySQL Slave RR Update Counts (Changes Cleared) = 129145
…
Testing Non-Secure Replication
A series of experiments was conducted to determine whether MySQL replication would prove to be reliable with SSL enabled. Please note that all tests were conducted using the MyDNS schema, which includes the SOA and RR tables, on MySQL 5.1.
The first experiment set focused on replication operations and not on a predetermined mix of Insert, Update, or Delete patterns, so Inserts were used, since they are the easiest to tag and verify. Again, the focus is on replication channel fault recovery.
Later iterations used simultaneous Inserts, Deletes and Updates
against both SOA and RR tables from production snapshots of the
mydns database, with request sets greater than 100,000 rows each,
over an SSL replication channel.
The first test was done by closing the Slave server's outbound SQL port to the Master with iptables; the Slave was unaware it was no longer …
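For reference, the kind of iptables rule used to silently sever the channel looks like this (the port and chain are my assumptions; DROP, rather than REJECT, is what leaves the Slave unaware):

shell> iptables -A OUTPUT -p tcp --dport 3306 -j DROP
## deleting the rule restores the replication channel:
shell> iptables -D OUTPUT -p tcp --dport 3306 -j DROP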
This is an old post I put on the old mysqlhow2.com that ended up on modphp.org. Both sites are run by people I know, so it's cool. Anyway, I thought it should be here as well.
Let's look at how the query will execute without returning all of the results:
NOTE: Use EXPLAIN
- EXPLAIN will show the number of rows likely to be examined by your query. (The more rows, the slower the return.)
- Example of EXPLAIN (we will be using the world database from mysql.com):
EXPLAIN SELECT * from City where CountryCode = 'USA' and Population > '1000' order by District;
Results:
+----+-------------+-------+------+---------------+-----+---------+-----+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-----+---------+-----+------+-------+
…
MySQL replication is a great feature: it's easy to set up and relatively easy to manage, it can provide scalability and availability, a master can serve dozens of slaves, etc. However,
sometimes people have the problem that the slaves can't keep up;
they get further and further behind, which is a major problem. So
this post talks about why that happens (sometimes) and what you
can do about it.
Causes:
- The most common reason a slave cannot keep up with the master is the SQL Thread. There are two threads on the slave that handle replication: the IO Thread, which connects to the master and pulls down binary log events, and the SQL Thread, which executes those events on the slave. The master, however, might have dozens of concurrent threads executing inserts/updates/deletes; when those changes reach the slave, they are replayed by a single thread, which may not be able to sustain the same throughput. There's a good reason for the …
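A quick way to see this on a live slave is SHOW SLAVE STATUS; the fields below are standard, though the lag value is illustrative:

mysql> SHOW SLAVE STATUS\G
...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 512
...

If both threads are running but Seconds_Behind_Master keeps growing, the single SQL Thread simply cannot apply events as fast as the master's many threads produce them.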