Showing entries 1 to 10 of 12
2 Older Entries »
Displaying posts with tag: innodb internals (reset)
Innodb Compression: When More is Less

So Vadim posted on the MySQL Performance Blog about poor benchmarks when running innodb compressed pages.  I ran some tests a few weeks ago and did not see the same results as him and checked into my previous tests and compared them to his #’s.  In a round about way verifying his thoughts on Mutex contention I found that increasing the BP sized with compressed data decreases the transactional throughput. The test was run with an uncomressed data set size of 6GB, 3.1GB compressed read-only.

2G, NOZIP 3217.19
[Read more]
I thought InnoDB page size was always 16k ...

Let me just start off this post saying if you're not interested in InnoDB tablespace internals, then this might not be the post for you. :)

At any rate, whilst examining individual tablespace files (the .ibd file created when running with innodb_file_per_table option enabled) in their binary format, I noticed the initial page size for a individual tablespace did not appear to be 16k (as the default InnoDB page size is 16k).

Upon examining the actual binary data stored in the data file (just create a basic table and insert 10 rows), you can see data is written at the very beginning of the file (position 0).

One can also see data appearing at several (4) other places in this file.

I was trying to sync up what I was seeing with these InnoDB internals:

[Read more]
Filtering by table is now possible with WaffleGrid

Since I have been a home recently, I put some time correcting bugs in WaffleGrid and adding new features. Thanks to gdb, I have been able to understand a silly bug that was affecting WaffleGrid with sysbench but, weird enough, not with dbt2. Everything is in the way connections are established. I will blog more about that soon.

Regarding the new features, it is now possible to choose which tables you want to push to memcached. For that purpose, two new parameters have been introduce:

innodb_memcached_table_filter_enable = 0 | 1   (default to 0)

to enable the filtering and

innodb_memcached_table_list = db1/table1,db2/table2   

to list the tables. This feature is filtering based on the space id so, innodb_file_per_table has to be set. Right now, the association table space_id is done only at startup so, the table has to exist. Also, since an alter table change the space_id… you need to restart MySQL …

[Read more]
You know you have really screwed up when …

You crash MySQL/Innodb and you trace the error to a function that says:

“This function checks the consistency of an index page when we do not
know the index. This is also resilient so that this should never crash
even if the page is total garbage. ”

Oops… I guess its not that resilient after all:)

Waffle: Progress and a Rearchtecture?

So I spent several hours over the last few days on the Secondary index bug. Out of frustration I decided to try and bypass the LRU concept all together and try going to a true secondary page cache. In standard Waffle a page is written to memcached only when it is expunged ( or LRU’d ) from the main buffer pool. This means anything in the BP should not be in memcached. Obviously with this approach we missed something, as Heikii pointed out in a comment to a previous post, it seems likely we are getting an old version of a page. Logically this could happen if we do not correctly expire a page on get or we bypass a push/lru leaving an old page in memcached to be retrieved later on.

So I was thinking why not bypass the LRU process? While I feel this is the most efficient way to do this, its not the only way. I modified innodb to use the default LRU code and then modified the page get to push to memcached on any disk read. Additionally I added …

[Read more]
Waffle: The Mystery Continues

So I spent the weekend looking at places where we may have missed something in the code for waffle. You can actually see some of the stuff I tried in the bug on launchpad about this, but the weird thing is the very last thing I tried. As I took a step back and looked at the problem ( secondary index corruption ) and our assumption that we “missed” something, I decided to find the place where pages are written to disk and to push to memcached from here as well as from the LRU. With the double write buffer enabled that place should be buf_flush_buffered_writes. By pushing to memcached here we should eliminate the page that falls through the cracks of the LRU. Basically this should help ensure memcached has an exact copy of the data that exists on disk. The result? It failed with the same secondary index failure. This means:

a.) maybe we have a problem in the …

[Read more]
Waffle: limiting the space ids being pushed to memcached

If you read Yves blog post about waffle yesterday we are seeing some weird gremlins in the system and could use some scoobey doo detective work if you have some ideas. The strange thing is it only exhibits under high load. So it really seems like we may have missed some background cleanup process that accesses or removes pages from disk or the buffer pool without going through the functions we call waffle in (buf_LRU_search_and_free_block & buf_read_page_low ).

One of the idea’s I had was trying to narrow the scope of what’s being pushed and read form Memcached. Even though I am using file per table, system tablespace pages are still making it in and out of memcached. I thought if we missed something maybe it was here ( even though I could not find it in the code ). I mean cleaning up undo or internal data would seem like a logical place to miss something. So I hacked …

[Read more]
Oracle really….

Ummmmmfff…. wind out of my sails today. I believe their was an analyst who predicted this a year ago when Sun bought MySQL…. I wish I could find the article. I am going to head down to the UC floor and get a feel for what people are thinking… will post more as I get time.

How much does it cost to update an index?

I was asked today about what is the cost of adding an index on a frequently updated column ( like a timestamp, count, or weight )… typically my answer is it depends. But for this question it was narrowed down to a specific case. An update on a secondary index based on a PK lookup. I decided to try and give an exact answer. I hacked the Juice DB Benchmark to attack my medium sized table ( which magically already had a count column in it ). I then cranked up the test. A few more details Query 23 updated a column without an index, queries 21,23,24 updated the d_count column. query 21 adds 5 to the count, query 22 adds 150, query 24 subtracts 1…. here are the results:

With a solo index on d_count:

Run Number:  86  threads:  8 Length :  340 LoadType: upd
Total Test Runtime = 375.245010137558 seconds, limiting results to 300 seconds however
QNum:     21 ... QCount:  78448 ... QTime:   0.003985 ... Max:   0.095937 ... FlatTime:   0.003673 …
[Read more]
Waffle Grid: Async IO Concerns

Last night I was on a plane to a client site and I was reviewing the waffle grid code for inclusion of multi-gets when calling read-ahead functions (potentially in other area’s as well)…  their I noticed something that may slow down our performance in waffle grid.  You see we read from memcached in the buf_read_page_low function.  This function is responsible for checking wether a page exists in the buffer and if not making an IO request for it via the function fil_io.  What I saw in buf_read_page_low was:  we check the buffer, then check memcached, then go get the page off disk… you follow?  Alright The get page from disk part is challenge.  Let me do a quick deep dive on the internal function calls.

You see buf_read_page_low is called by a few different functions in a few different ways, the ones I am concerned with are the functions that are pasing a sync of false ( you do not want synchronous IO, …

[Read more]
Showing entries 1 to 10 of 12
2 Older Entries »