A colleague at Sun asked me for tips on how to tune MySQL for fast bulk loads of CSV files. (The use case: inserting data into a data warehouse during a nightly load window.) Considering that I spend most of my time working with MySQL Cluster, I was amazed at how many tips I could come up with for both MyISAM and InnoDB. So I thought it might be interesting to share. Also: do you have any more tips to add?
[A Sun partner] has requested a PoC to test MySQL's bulk
loading capabilities from CSV files. They have about 20GB of
compressed CSV files, and they want to see how long it takes to
load them.
They haven't specified which storage engine they intend to use
yet.
Good start: a well-defined PoC. The storage engine question is actually significant here.
- MyISAM may be up to twice as fast as InnoDB for bulk loads. (But there may be some tuning that makes …