Everyone knows the Internet Movie Database, and I wanted to use their public data as a large data set to create some complex queries. The MySQL world database is excellent to teach SQL basics, but not sufficient to teach performance tuning or real life queries.
Since the data comes in a bunch of clumsy text files, it is not easy to parse and import them. Fortunately, this work was done by some German guys. All you need, is their JMDB software, a java environment, of course a MySQL database and... lots of time.
They claim that it's possible to load all the data within 2 hours, but it took my little laptop almost a day. After the import procedure, you end up with a database called "jmdb" and a nice number of huge tables. In total, it's more than 20 million rows in 44 tables.
It's all MyISAM tables, which is fast enough as the import doesn't run parallel queries. And for reading, MyISAM will also be the best option, …
[Read more]