Leaving Sun/MySQL but not the community

I thought I would put out a quick note about this right now: I am going to be leaving Sun next week. I do have some plans which will start to materialize over the next few weeks and months, so stick around for that. Even though I am leaving the company, I will still be lurking and working on all things database and performance related. This should not have any bearing on Waffle Grid; hopefully it will only mean I will post more and spend more time doing crazy and cool things!

I wanted to thank everyone at MySQL who I worked with and learned from over the past few years.

WaffleGrid: Cream Benchmarks, stable and delivering a 3x boost

Let's get down to how the latest version of Waffle Grid performs.

Starting off simple, let's look at the difference between the Waffle Grid modes. As mentioned before, the LRU mode is the “classic” Waffle Grid setup. A page is put into memcached when the page is removed from the buffer pool via the LRU process. When a page is retrieved from memcached it is expired, so it's no longer valid there. In the new “Non-LRU” mode, when a page is read from disk, the page is placed in memcached. When a dirty page is flushed to disk, this page is overwritten in memcached. So how do the different modes perform?
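The two modes described above can be sketched as a toy model. This is only an illustration and not the Waffle Grid patch itself: the class `ToyWaffle`, its fields, and the mode names `"lru"` and `"non_lru"` are all made up for the example, and plain dictionaries stand in for memcached and the data files.

```python
from collections import OrderedDict


class ToyWaffle:
    """Toy buffer pool backed by a memcached-like secondary page cache."""

    def __init__(self, pool_size, mode):
        self.pool = OrderedDict()  # page_id -> data, least recently used first
        self.pool_size = pool_size
        self.mode = mode           # "lru" or "non_lru" (hypothetical names)
        self.memcached = {}        # stand-in for the remote cache
        self.disk = {}             # stand-in for the data files
        self.disk_reads = 0

    def _evict_if_full(self):
        while len(self.pool) > self.pool_size:
            page_id, data = self.pool.popitem(last=False)
            if self.mode == "lru":
                # Classic mode: push the page when it is LRU'd out of the pool.
                self.memcached[page_id] = data

    def read(self, page_id):
        if page_id in self.pool:
            self.pool.move_to_end(page_id)
            return self.pool[page_id]
        if page_id in self.memcached:
            if self.mode == "lru":
                # Classic mode: expire on get, so the cached copy is gone.
                data = self.memcached.pop(page_id)
            else:
                data = self.memcached[page_id]
        else:
            data = self.disk[page_id]
            self.disk_reads += 1
            if self.mode == "non_lru":
                # Non-LRU mode: push the page as soon as it is read from disk.
                self.memcached[page_id] = data
        self.pool[page_id] = data
        self._evict_if_full()
        return data

    def flush(self, page_id, data):
        """Write a new version of a page to disk (a simplified dirty-page flush)."""
        self.disk[page_id] = data
        if page_id in self.pool:
            self.pool[page_id] = data
        if self.mode == "non_lru":
            # Non-LRU mode: overwrite the cached copy on flush.
            self.memcached[page_id] = data


waffle = ToyWaffle(pool_size=2, mode="lru")
waffle.disk = {1: "a", 2: "b", 3: "c"}
for page in (1, 2, 3, 1):
    waffle.read(page)
print(waffle.disk_reads, sorted(waffle.memcached))
```

In the classic mode a page lives in either the buffer pool or memcached, never both, while in the Non-LRU mode memcached holds a copy of whatever was last read from or written to disk.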

4GB Memcached, Read Ahead Enabled    TPM         % Increase
No Waffle                            3245.79     Baseline
Waffle LRU                           10731.34

WaffleGrid: 0.5 Cream Release

I wanted to let everyone know that we are releasing Waffle Grid 0.5, code-named Cream. This release fixes the nasty secondary index bug that plagued the Butter release. I have been running tests on this code base for about a week straight with no errors. While I think this release is much more stable, I would remind everyone this is still not a fully GA release. This release includes the ability to choose the mode of Waffle Grid. Setting innodb_memcached_enable to 1 pushes pages to memcached when a disk read is done or when a page write is done; setting it to 2 enables the classic LRU mode. If you decide to set this to 1 (non-LRU), I would recommend using the standard memcached. As with previous versions, the LRU mode works better with our slightly altered memcached (expire from memcached on get). I will be posting benchmarks and more details within the next couple of days. Right now you can grab the patch on …

Be Wary of Large Log File Sizes

As some people have mentioned here and here, increasing the InnoDB log file size can lead to nice increases in performance. This is a trick we often deploy with clients, so there is not anything really new here. However, there is a caveat: please be aware there is a potentially huge downside to having large log file sizes, and that's crash recovery time. You trade real-world performance for crash recovery time. When you're expecting your shiny Heartbeat-DRBD setup to fail over in under a minute, this can be disastrous! In fact, I have been to some places where recovery time is in the hours. Just keep this in mind before you change your settings.

Waffle: Progress and a Rearchitecture?

So I spent several hours over the last few days on the secondary index bug. Out of frustration I decided to try to bypass the LRU concept altogether and go to a true secondary page cache. In standard Waffle a page is written to memcached only when it is expunged (or LRU'd) from the main buffer pool. This means anything in the BP should not be in memcached. Obviously with this approach we missed something. As Heikki pointed out in a comment to a previous post, it seems likely we are getting an old version of a page. Logically this could happen if we do not correctly expire a page on get, or if we bypass a push/LRU, leaving an old page in memcached to be retrieved later on.
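To make the suspected failure concrete, here is a minimal sketch of how an old version of a page could come back. It is a guess at the sequence of events, not something taken from the Waffle code: the function `replay`, the page name, and the version strings are all hypothetical. In this toy model the stale read needs both an old copy left behind on get and a later eviction that skips the push.

```python
def replay(expire_on_get):
    """Replay one page's life and return the version the final read sees."""
    disk = {"page7": "v1"}
    memcached = {}
    buffer_pool = {}

    # 1. The page is LRU'd out of the buffer pool and pushed to memcached.
    memcached["page7"] = disk["page7"]

    # 2. The page is needed again and is fetched from memcached.
    if expire_on_get:
        buffer_pool["page7"] = memcached.pop("page7")
    else:
        buffer_pool["page7"] = memcached["page7"]  # old copy is left behind

    # 3. The page is modified in the buffer pool and flushed to disk.
    buffer_pool["page7"] = "v2"
    disk["page7"] = buffer_pool["page7"]

    # 4. The page leaves the pool through a path that skips the push.
    del buffer_pool["page7"]

    # 5. The next read checks memcached first, then falls back to disk.
    return memcached.get("page7", disk["page7"])


print(replay(expire_on_get=True))   # v2, read from disk
print(replay(expire_on_get=False))  # v1, the old copy from memcached
```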

So I was thinking, why not bypass the LRU process? While I feel the LRU approach is the most efficient way to do this, it's not the only way. I modified InnoDB to use the default LRU code and then modified the page get to push to memcached on any disk read. Additionally I added …

Waffle: The Mystery Continues

So I spent the weekend looking at places where we may have missed something in the code for Waffle. You can actually see some of the stuff I tried in the bug on Launchpad about this, but the weird thing is the very last thing I tried. As I took a step back and looked at the problem (secondary index corruption) and our assumption that we “missed” something, I decided to find the place where pages are written to disk and push to memcached from there as well as from the LRU. With the doublewrite buffer enabled, that place should be buf_flush_buffered_writes. By pushing to memcached here we should eliminate the page that falls through the cracks of the LRU. Basically this should help ensure memcached has an exact copy of the data that exists on disk. The result? It failed with the same secondary index failure. This means:

a.) maybe we have a problem in the …

Waffle: limiting the space ids being pushed to memcached

If you read Yves' blog post about Waffle yesterday, you know we are seeing some weird gremlins in the system and could use some Scooby-Doo detective work if you have some ideas. The strange thing is that the problem only shows up under high load. So it really seems like we may have missed some background cleanup process that accesses or removes pages from disk or the buffer pool without going through the functions we call Waffle in (buf_LRU_search_and_free_block & buf_read_page_low).

One of the ideas I had was to narrow the scope of what's being pushed to and read from memcached. Even though I am using file per table, system tablespace pages are still making it in and out of memcached. I thought if we missed something, maybe it was here (even though I could not find it in the code). I mean, cleaning up undo or internal data would seem like a logical place to miss something. So I hacked Waffle …

Sun/Intel X-25e 4 Disk Raid 10 tests - part 2

So let's test some different configurations and try to build some best practices around multiple SSDs:

Which is better? Raid 5 or Raid 10?

As with regular disks, Raid 10 seems to perform better (except for pure reads). I did see a lot of movement from test to test, for example between the 67% read test and the 75% or 80% tests. But all in all RAID 10 seemed to be the optimal config.

Should you enable the controller cache?

One of the things I have found in my single drive tests is that “dumb” controllers tend to give better performance numbers than “smart” controllers. Really expensive controllers tend to have extra logic to compensate for the limitations of traditional disks. So I decided to play with some of the controller options. The obvious one is the cache on the controller.

Some tests showed substantially better performance when the disk cache was disabled ( both read & write ).

If better …

x-25e, 25% reduction in random writes…

So in my previous post I showed some benchmarks with a large drop-off in performance when you fill the x-25e. I wanted to follow up and say this: even if you do everything correctly (i.e. leave 50%+ space free, disable controller cache, etc.) you may still see a drop in performance if your workload is heavily write skewed. To show this I ran a 100% random write sysbench fileio test over a 12GB dataset (37.5% full). The tests were run back-to-back over several hours. Here is what we see:

*Note the scale is a little skewed here (I start at 2500 reqs).

Each data point represents 2 million IOs, so somewhere after about 6 million IOs we start to drop. At the end it looks like we stabilize around 2900-3000 requests per second, an overall drop of about 25%.

Intel X-25e and MySQL Part 1b: Don’t let your Drive Over Eat!

The plan was only to do two quick posts on RAID performance on the X-25e, but this was compelling enough to post on its own. So in part I, Mark Callaghan asked: hey, what gives with the SLC Intel's single drive random write performance? It's lower than the MLC drive's. To be completely honest with you, I had overlooked it; after all, I was focusing on RAID performance. This was my mistake, because this is actually caused by one of the Achilles' heels of most flash on the market today: crappy performance when you fill more of the drive. I don't really know what the official title for it is, but I will call it “Drive Overeating”.

Let me try and put this simply: a quick trick most vendors use to push better random write numbers and help wear leveling is to not …

