We’re working with a web site that is preparing for massive growth. To make sure it handles large data sets, part of the process is generating a test database of significant size, because testing your application on a table with 1000 rows may well give you a very dangerous false sense of security.
One of the processes the web site had was creating summary tables, which was done by executing some huge GROUP BY query, doing some processing on the results, and then populating the tables. This all worked well for small tables… but not for larger ones.
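For concreteness, here is a rough sketch of that kind of job. The post doesn’t show the actual queries, so the table and column names below are made up.

```php
<?php
// Hypothetical summary-table job: aggregate raw events with a big GROUP BY,
// post-process each row in PHP, and write it into a summary table.
$link = mysql_connect('localhost', 'user', 'password');
mysql_select_db('app', $link);

$result = mysql_query(
    "SELECT user_id, DATE(created) AS day, COUNT(*) AS hits
     FROM events
     GROUP BY user_id, DATE(created)", $link);

while ($row = mysql_fetch_assoc($result)) {
    // "doing some processing on the results" -- whatever per-row work is needed
    mysql_query(sprintf(
        "INSERT INTO daily_summary (user_id, day, hits) VALUES (%d, '%s', %d)",
        (int) $row['user_id'],
        mysql_real_escape_string($row['day'], $link),
        (int) $row['hits']), $link);
}
mysql_free_result($result);
```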
The first problem was that the PHP script generating the table took 10GB of RAM and was swapping the development server, which had just 4GB of RAM (and plenty of swap space), like crazy. Why? Because by default mysql_query uses the mysql_store_result C library call and buffers the whole result set in the process memory. Not good if there are over 50 million rows. Note this limit is not controlled by …
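A minimal sketch of the streaming alternative, assuming the old ext/mysql API the post refers to: mysql_unbuffered_query() uses mysql_use_result() under the hood, so rows are pulled from the server one at a time instead of being buffered in the PHP process.

```php
<?php
// Same job, but streaming the aggregated rows instead of buffering them all.
// Memory use stays roughly constant no matter how many rows the GROUP BY
// produces.
$result = mysql_unbuffered_query($huge_group_by_sql, $link);

while ($row = mysql_fetch_assoc($result)) {
    // Caveat: while an unbuffered result is open, no other query can run on
    // the same connection, so the INSERTs into the summary table have to go
    // over a second connection (or be collected and run after the loop).
    mysql_query(build_insert_sql($row), $second_link); // hypothetical helper and link
}
mysql_free_result($result);
```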