I have started seriously using PHP 5.3 recently due to it finally
making it into Portage. (Gentoo really isn't full of
bleeding edge packages people.) I have used mysqlnd a little here
and there in the past, but until it was really coming to my
servers I did not put too much time into it.
What is mysqlnd?
mysqlnd is short for MySQL Native Driver. In short, it is a
driver for MySQL for PHP that uses internal functions of the PHP
engine rather than using the externally linked libmysqlclient
that has been used in the past. There are two reasons for this.
The first reason is licensing. MySQL is a GPL project. The GPL
and the PHP License don't play well together. The second is
better memory management and hopefully more performance. Being a
performance junky, this is what peaked my interests. Enabling
mysqlnd means it is used by the older MySQL extension, the newer
MySQLi extension and the MySQL PDO driver.
New Key Feature - fetch_all
One new feature of mysqlnd was the fetch_all method on MySQLi
Result objects. At both dealnews.com and in Phorum I have written
a function to simply run a query and fetch all the results into
an array and return it. It is a common operation when writing API
or ORM layers. mysqlnd introduces a native fetch_all method that
does this all in the extension. No PHP code needed. PDO already
offers a fetchAll method, but PDO comes with a little more
overhead than the native extensions and I have been using mysql
functions for 14 years. I am very happy using them.
Store Result vs. Use Result
I have spoken in the past (see my slides and interview: MySQL
Tips and Tricks) about using mysql_unbuffered_query or using
mysqli_query with the MYSQLI_USE_RESULT flag. Without going into
a whole post about that topic, it basically allows you to stream
the results from MySQL back into your PHP code rather than having
them buffered in memory. In the case of libmysqlclient, they
could be buffered twice. So, my natural thought was that using
MYSQLI_USE_RESULT with fetch_all would yield the most awesome
performance ever. The data would not be buffered and it would get
put into a PHP array in C instead of native code. The code I had
hoped to use would look like:
$res = $db->query($sql, MYSQLI_USE_RESULT);
$rows = $res->fetch_all(MYSQLI_ASSOC);
But, I quickly found out that this does not work. For some
reason, this is not supported. fetch_all only works with the
default which is MYSQLI_STORE_RESULT. I filed a bug which was
marked bogus. Which I put back to new because I really don't see
a reason this should not work other than a complete oversight by
the mysqlnd developers. So, I started doing some tests in hopes I
could show the developers how much faster using MYSQLI_USE_RESULT
could be. What happened next was not expected. I ended up
benchmarking several different options for fetching all the rows
of a result into an array.
Test Data
I tested using PHP 5.3.3 and MySQL 5.1.44 using InnoDB tables.
For test data I made a table that has one varchar(255) column. I
filled that table with 30k rows of random lengths between 10 and
255 characters. I then selected all rows and fetched them using 4
different methods.
- mysqli_result::fetch_all*
- PDOStatement::fetchAll
- mysqli_query with MYSQLI_STORE_RESULT followed by a loop
- mysqli_query with MYSQLI_USE_RESULT followed by a loop
In addition, I ran this test with mysqlnd enabled and disabled.
For mysqli_result::fetch_all, only mysqlnd was tested as it is
only available with mysqlnd. I ran each test 6 times and threw
out the worst and best result for each test. FWIW, the best and
worst did not show any major deviation for any of the tests. For
measuring memory usage, I read the VmRSS value from Linux's /proc
data. memory_get_usage() does not show the hidden memory used by
libmysqlclient and does not seem to show all the memory used by
mysqlnd either.
So, that is what I found. The memory usage graphs are all what I
thought they would be. PDO has more overhead by its nature.
Storing the result always uses more memory than using it.
mysqli_result::fetch_all uses less memory than the loop, but more
than directly using the results.
There are some very surprising things in the timing graphs
however. First, the tried and true method of using the result
followed by a loop is clearly still the right choice in
libmysqlclient. However, it is a horrible choice for mysqlnd. I
don't really see why this is so. It is nearly twice as slow.
There is something really, really wrong with MYSQLI_USE_RESULT in
mysqlnd. There is no reason it should ever be slower than storing
the result and then reading it again. This is also evidenced in
the poor performance of PDO (since even PDO uses mysqlnd when
enabled). PDO uses an unbuffered query for its fetchAll method
and it too got slower. It is noticably slower than
libmysqlclient. The good news I guess is that if you are using
mysqlnd, the fetch_all method is the best option for getting all
the data back.
Next Steps
My next steps from here will be to find some real workloads that
I can test this on. Phorum has several places where I can apply
real world pages loads to these different methods and see how
they perform. Perhaps the test data is too small. Perhaps the
number of columns would have a different effect. I am not
sure.
If you are reading this and have worked on or looked at the
mysqlnd code and can explain any of it, please feel free to
comment.