While preparing a Percona-XtraDB template to run in the RightScale environment, I noticed that IO performance on an EBS volume in the EC2 cloud is far from perfect, so I spent some time benchmarking volumes. The interesting part about EBS volumes is that each one shows up as a block device in your OS, so you can easily build software RAID from several volumes.
So I created 4 volumes (I used an m.large instance) and built the following arrays:
RAID0 on 2 volumes as:
mdadm -C /dev/md0 --chunk=256 -n 2 -l 0 /dev/sdj /dev/sdk
RAID0 on 4 volumes as:
mdadm -C /dev/md0 --chunk=256 -n 4 -l 0 /dev/sdj /dev/sdk /dev/sdl /dev/sdm
RAID5 on 3 volumes as:
mdadm -C /dev/md0 --chunk=256 -n 3 -l 5 /dev/sdj /dev/sdk /dev/sdl
RAID10 on 4 volumes in two steps:
mdadm -v --create /dev/md0 --chunk=256 --level=raid1 --raid-devices=2 /dev/sdj /dev/sdk
mdadm -v --create /dev/md1 --chunk=256 --level=raid1 --raid-devices=2 /dev/sdm /dev/sdl
and then a RAID0 on top of the two mirrors:
mdadm -v --create /dev/md2 --chunk=256 --level=raid0 --raid-devices=2 /dev/md0 /dev/md1
In Linux you can also create the trickier RAID10,f2, the "far" layout with 2 copies (it is described at http://www.mythtv.org/wiki/RAID):
mdadm -C /dev/md0 --chunk=256 -n 4 -l 10 -p f2 /dev/sdj /dev/sdk /dev/sdl /dev/sdm
I also tested IO on a single volume.
I used the xfs filesystem mounted with the noatime,nobarrier options,
and for the benchmark I used the sysbench fileio modes on a 16GB file (the script also runs a 256MB pass) with the following script:
#!/bin/sh
# Abort on unset variables (-u) and on errors (-e); trace commands (-x).
set -u
set -x
set -e
for size in 256M 16G; do
  for mode in seqwr seqrd rndrd rndwr rndrw; do
    # Create the test file for this size.
    ./sysbench --test=fileio --file-num=1 --file-total-size=$size prepare
    for threads in 1 4 8 16; do
      echo PARAMS $size $mode $threads > sysbench-size-$size-mode-$mode-threads-$threads
      # 60-second run with O_DIRECT and no periodic fsync.
      ./sysbench --test=fileio --file-total-size=$size --file-test-mode=$mode \
        --max-time=60 --max-requests=10000000 --num-threads=$threads --init-rng=on \
        --file-num=1 --file-extra-flags=direct --file-fsync-freq=0 run \
        >> sysbench-size-$size-mode-$mode-threads-$threads 2>&1
    done
    ./sysbench --test=fileio --file-total-size=$size cleanup
  done
done
So the tested modes are seqrd (sequential read), seqwr (sequential write), rndrd (random read), rndwr (random write) and rndrw (random read-write). sysbench uses a 16KB block size here, which emulates InnoDB working with 16KB pages.
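For completeness, the filesystem setup used for the benchmark (xfs mounted with noatime,nobarrier) might look roughly like the sketch below. The device /dev/md0 matches the arrays above, but the mount point /mnt/data and the helper names are my assumptions, not taken from the original setup. By default the commands are only printed, since mkfs destroys whatever is on the device.

```shell
#!/bin/sh
# Sketch: create an XFS filesystem on the md array and mount it with the
# options used in the benchmark. The mount point is an assumption.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

prepare_fs() {
    device=$1
    mountpoint=$2
    # -f overwrites any existing filesystem signature on the device.
    run mkfs.xfs -f "$device"
    run mkdir -p "$mountpoint"
    run mount -t xfs -o noatime,nobarrier "$device" "$mountpoint"
}

# Example (prints the commands; set DRY_RUN=0 and run as root to apply):
# prepare_fs /dev/md0 /mnt/data
```

Run the benchmark script from a directory on the mounted filesystem so that the sysbench test file lands on the array being measured.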
You can find the raw results in Google Docs (https://spreadsheets.google.com/ccc?key=0AjsVX7AnrCYwdFlBVW9KWVJGUGFqeVdpUHY0Y0VXYXc&hl=en), but let me show the most interesting results from my point of view. On the graphs I show requests per second (more is better) and the 95th percentile response time in ms (less is better).
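The two numbers shown on the graphs can be pulled out of the result files that the script writes. The following is only a sketch: it assumes sysbench 0.4-style output containing a "Requests/sec executed" line and an "approx. 95 percentile:" line, plus the PARAMS line the script puts at the top of each file. The function name summarize is mine.

```shell
#!/bin/sh
# Sketch: print "size mode threads requests_per_sec p95_ms" for one
# sysbench fileio result file. Assumes lines shaped like
#   "PARAMS 16G rndrd 4"
#   " 1952.45 Requests/sec executed"
#   "approx.  95 percentile:   1.20ms"
summarize() {
    awk '
        /^PARAMS/                { size = $2; mode = $3; threads = $4 }
        /Requests\/sec executed/ { rps = $1 }
        /95 percentile:/         { p95 = $NF; sub(/ms$/, "", p95) }
        END { print size, mode, threads, rps, p95 }
    ' "$1"
}

# Example: one summary line per result file
# for f in sysbench-size-*; do summarize "$f"; done
```

If your sysbench version formats its report differently, the two patterns are the only part that needs adjusting.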
What I see from the results is that if you are looking for IO performance in an EC2/EBS environment, it is definitely worth considering some RAID setup.
RAID5 does not show any benefit compared with the others, and RAID10,f2 is worse than RAID10.
As for RAID0 vs RAID10, it's your call. On a regular server I'd never suggest RAID0 for a database, but with EBS I am not sure what guarantees Amazon gives. I'd expect that a redundant array already exists underneath an EBS volume, so it may not be worth adding extra redundancy, but I am not sure about that.
For now I'd consider RAID10 on 4 to 10 volumes.
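Part of the RAID0 versus RAID10 trade-off is usable capacity. Here is a rough sketch of the arithmetic, assuming equal-size volumes, two copies of every block in RAID10, and ignoring metadata overhead. The function name and the 100 GB volume size are made up for illustration.

```shell
#!/bin/sh
# Sketch: usable capacity in GB for N equal-size volumes.
# usage: usable_gb LEVEL N SIZE_GB
usable_gb() {
    level=$1
    n=$2
    size=$3
    case $level in
        0)  echo $(( n * size )) ;;        # striping only, no redundancy
        5)  echo $(( (n - 1) * size )) ;;  # one volume's worth of parity
        10) echo $(( n * size / 2 )) ;;    # every block stored twice
        *)  echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}

# Example with hypothetical 100 GB volumes:
# usable_gb 0 4 100
# usable_gb 10 4 100
```

So with 4 volumes, RAID10 gives half the space of RAID0 in exchange for surviving the loss of a volume.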
And of course, to benefit from multi-threaded IO in MySQL you need to use XtraDB or MySQL 5.4.
However, there may be a small problem with backups over EBS. On a single EBS volume you can just take a snapshot, but with several volumes it may be tricky. In that case you may consider LVM snapshots or XtraBackup.
Entry posted by Vadim