Sysbench has three distributions for random numbers: uniform,
special and gaussian. I mostly use uniform and special, and I
feel that neither fully reflects my needs when I run
benchmarks. Uniform is stupidly simple: for a table with 1 million
rows, each row gets an equal share of hits. This barely reflects
a real system, and it does not allow effective testing of a caching
solution, since every row is equally likely to be cached or evicted.
That’s why there is the special distribution, which is better but
goes to the other extreme: it is skewed toward a very small
percentage of rows, which makes it good for testing a cache but
makes it hard to emulate a high IO load.
That’s why I was looking for alternatives, and the Zipfian
distribution seems a decent one. This distribution has a
parameter θ (theta), which defines how skewed the distribution
is. The physical sense of this parameter, if applied …
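To get a feel for what θ does, here is a minimal sketch in Python. It assumes the textbook Zipfian form, where the row at popularity rank k is hit with probability proportional to 1/k^θ; this is not necessarily how any particular sysbench build implements it, and the function names `zipf_weights`, `hot_share` and `sample_rows` are made up for illustration.

```python
import bisect
import random
from itertools import accumulate


def zipf_weights(n_rows, theta):
    """Unnormalized weights: the row at rank k gets weight 1 / k**theta."""
    return [1.0 / (k ** theta) for k in range(1, n_rows + 1)]


def hot_share(n_rows, theta, hot_fraction=0.01):
    """Fraction of all hits that land on the hottest `hot_fraction` of rows."""
    weights = zipf_weights(n_rows, theta)
    hot_rows = max(1, int(n_rows * hot_fraction))
    return sum(weights[:hot_rows]) / sum(weights)


def sample_rows(n_rows, theta, count, seed=0):
    """Draw `count` row ranks (1-based) by inverting the cumulative weights."""
    rng = random.Random(seed)
    cumulative = list(accumulate(zipf_weights(n_rows, theta)))
    total = cumulative[-1]
    return [
        bisect.bisect_left(cumulative, rng.random() * total) + 1
        for _ in range(count)
    ]


if __name__ == "__main__":
    # Table size and theta values are arbitrary, chosen only for illustration.
    for theta in (0.0, 0.5, 0.9, 1.2):
        share = hot_share(100_000, theta)
        print(f"theta={theta}: hottest 1% of rows get {share:.1%} of hits")
```

With θ = 0 every weight is 1, so this degenerates into the uniform case; as θ grows, a larger share of hits lands on the hottest rows.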
In my previous post with results for Fusion-io ioDrive we saw some instability
in the results. It was pointed out to me that this may be fixed in
the new VSL 3.1.1 drivers. I am not sure if this driver is available
to everyone; if you are interested, please contact your Fusion-io
support representative. I installed the new drivers and firmware,
and the results did in fact improve.
Information about driver and firmware: Firmware v6.0.0, rev
107006. Fusion-io driver version: 3.1.1 build 172.
The upgrade was not flawless, though: after the firmware upgrade I had to perform a low-level format, which erases all data. So if you want to do the same, make sure you back up your data first.
So here are the results for driver 3.1 (compared with the previous driver 2.3).
Random writes:
…
I continue to run benchmarks of different SSD cards. This
time I show numbers for the Virident FlashMAX 1400. This is an MLC
PCIe SSD device. There are a couple of notes on these results.
First, this time I use a different server. For this benchmark it
is a Cisco UCS C250, while for the previous results I
used an HP ProLiant DL380 G6.
The second note is that I use the “turbo=1” mode for the Virident card.
What does that mean? Apparently the PCIe specification has a
limit on available power. If I am not mistaken it is
25W, but Virident requires 28W to provide full write
performance. And while many servers can handle 28W on
PCIe, this is a non-standard mode, so Virident by default uses
25W (turbo=0). To force full power, I load the driver with
turbo=1. I also use “maxperformance” formatting for
Virident, …
Following my series of posts on testing different SSDs, in
my last post I mentioned that SATA SSD
performance is getting closer to PCIe cards. It really makes
sense to test it under a MySQL workload, but before getting to
that, let me review the same workload on a Fusion-io ioDrive PCIe
card. This is the previous generation of Fusion-io cards, but
it is the one with the biggest installed base.
Driver information: Fusion-io driver version: 2.3.10 build 110;
Firmware v5.0.7, rev 107053
Following the format of the previous benchmarks, the first case is random write, async, 16KB.
We can see a wave-like pattern, with throughput of 350-400 MiB/sec.
Random reads, async:
…
Following my previous benchmarks of SATA SSD cards, I got an
Intel SSD 520 240GB into my hands. In this
post I show the raw IO performance results for this card.
I described the benchmark methodology in previous posts, so let me jump directly to the results.
The first case is random write, asynchronous IO, 8 threads;
the test is done just after a secure erase operation on the
card.
The card is doing a stable 380 …
Following my previous benchmark of the Samsung 830, today I want to show results for
the STEC MACH16 SATA card, 200GB size. This card is based on SLC, and
according to the STEC website, it is enterprise-grade storage.
For tests I use sysbench fileio with a 16KiB block size (to match the InnoDB workload, as this is my primary usage), and recently I switched to async IO mode. There are two reasons for that. First, MySQL/InnoDB uses async writes, so this emulates database load. Second, async mode shows the maximal possible throughput. It does not show reliable latency, though, as it appears there is no reliable way in the Linux asynchronous IO library to get time metrics for a particular IO …
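As a rough illustration of the setup described above, the sketch below builds a sysbench fileio command line in Python. The option names follow sysbench 1.0 syntax (older releases spell some of them differently); the file size, thread count, run time and the `direct` flag are assumptions for illustration, not the settings used for these results, and the helper name `sysbench_fileio_cmd` is hypothetical.

```python
def sysbench_fileio_cmd(test_mode, action, total_size="100G",
                        threads=8, seconds=3600):
    """Build a sysbench 1.0-style fileio command line as an argument list."""
    return [
        "sysbench", "fileio",
        f"--file-total-size={total_size}",  # assumed size of the test files
        f"--file-test-mode={test_mode}",    # rndwr = random write, rndrd = random read
        "--file-io-mode=async",             # asynchronous IO, as described in the text
        "--file-block-size=16384",          # 16KiB, to match the InnoDB page size
        "--file-extra-flags=direct",        # assumption: O_DIRECT, bypassing the page cache
        f"--threads={threads}",             # assumed thread count
        f"--time={seconds}",                # assumed run time in seconds
        action,                             # prepare, run or cleanup
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing touches the disk.
    for action in ("prepare", "run", "cleanup"):
        print(" ".join(sysbench_fileio_cmd("rndwr", action)))
```

Building the argument list separately from running it keeps the random write and random read cases identical except for `--file-test-mode`.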
I personally like PCIe-based Flash, but from a pricing point of view
our customers are looking for cheaper alternatives. SATA SSD is
one option. There are many products based on MLC technology, and
the Intel 320 is, I would say, the most popular. I do not particularly
like its write performance, as I wrote before, which is why I am looking for
comparable alternatives. The Samsung 830 256GB looked like a good
product, so I decided to test it.
For tests I use sysbench fileio with a 16KiB block size (to match the InnoDB workload, as this is my primary usage), and recently I switched to async IO mode. There are two reasons for that. First, MySQL/InnoDB uses async writes, so this emulates database load. Second, async mode shows the maximal possible throughput. It does not show …
We are running a lot of internal benchmarks on our recently
announced Percona XtraDB Cluster, and I am going to
publish these results soon.
But before that I wanted to mention that a proper benchmark of
a distributed system comes with a lot of challenges.
I am saying this not to complain, but to make sure that, if you are
going to benchmark XtraDB Cluster yourself, you know there are a lot of
things to take into account.
And it seems that one component, which was not very important
before, now appears to be a critical piece that can easily become
a bottleneck in the benchmarks: the network.
In the case of a simple client-server setup, the network is not fully
utilized.
But as …
This is the third post in the series leading up to the talk comparing the optimizer enhancements in MySQL 5.6 and MariaDB 5.5. This post focuses on the join-related optimizations introduced in the optimizer. These optimizations are available in both MySQL 5.6 and MariaDB 5.5, and MariaDB 5.5 has introduced some additional optimizations, which we will also look at in this post.
Now let me briefly explain these optimizations.
Batched Key Access
Traditionally, MySQL always uses Nested Loop Join to join two or more tables. What this means is that the selected rows from the first table participating in the join are read, and then for each of these rows an index lookup is performed on the second table. This means many point queries; for example, if table1 yields 1000 …
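The nested loop pattern described above, and a batched alternative in the spirit of Batched Key Access, can be sketched in Python. This is a deliberately simplified model, not the server's implementation: the dictionary standing in for an index, the function names and the `buffer_size` parameter are all hypothetical, and the batched variant only assumes that buffered keys are probed in sorted order, once per distinct key in the buffer.

```python
from collections import defaultdict


def build_index(rows, key):
    """Hypothetical index: maps a key value to the rows that hold it."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def nested_loop_join(outer_rows, inner_index, key):
    """One point lookup into the inner index for every outer row."""
    joined, lookups = [], 0
    for outer in outer_rows:
        lookups += 1
        for inner in inner_index.get(outer[key], []):
            joined.append({**outer, **inner})
    return joined, lookups


def batched_key_join(outer_rows, inner_index, key, buffer_size=100):
    """Buffer outer rows, then probe the index in key order, once per key."""
    joined, lookups = [], 0
    for start in range(0, len(outer_rows), buffer_size):
        by_key = defaultdict(list)
        for outer in outer_rows[start:start + buffer_size]:
            by_key[outer[key]].append(outer)
        for value in sorted(by_key):
            lookups += 1
            matches = inner_index.get(value, [])
            for outer in by_key[value]:
                for inner in matches:
                    joined.append({**outer, **inner})
    return joined, lookups
```

Both functions return the same joined rows (possibly in a different order); what changes is how many index probes are issued and the order in which they hit the inner table.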
MySQL version 4.1 was quite revolutionary. The main reason for that was support for sub-queries.1
However, since then MySQL users have been rather discouraged from using that functionality, mainly because of the implementation’s poor performance, and have been forced to build complicated queries based on joins rather than on subqueries.
Of course, you can make some effort to optimize your subquery, sometimes with very good results2. It is not always easy, or even possible, though, if you can’t change the code.
You might say it’s not a problem at all for typical OLTP, web-based traffic: just don’t use subqueries! That’s true, …