This week we are talking about size, a subject that should matter to any system administrator in charge of a project's backup system, and in particular of its database backups.
I sometimes get asked which compression tool is best for a particular backup setup: gzip? bzip2? something else?
The testing environment
In order to test several formats and tools, I created a .csv file (comma-separated values) that was 3,700,635,579 bytes in size by transforming a recent dump of all the OpenStreetMap nodes of the European portion of Spain. It had a total of 46,741,126 rows and looked like this:
171773 38.6048402 -0.0489871 4 2012-08-25 00:37:46 12850816 472193 rubensd
171774 38.6061981 -0.0496867 2 2008-01-19 10:23:21 666916 9250 …
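As an aside, a comparison like the one hinted at above can be sketched in a few lines of Python using the standard gzip and bz2 modules, which wrap the same algorithms as the command-line tools. This is only a minimal illustration: the file name nodes.csv is an assumed placeholder for the test file described above, not necessarily what it was actually called.

import bz2
import gzip
import os
import shutil
import time

# Assumed name for the 3.7 GB test file described above.
SOURCE = "nodes.csv"

def compress_with(opener, suffix):
    # Stream the source file through the given compressor and report
    # the resulting size and the wall-clock time it took.
    target = SOURCE + suffix
    start = time.time()
    with open(SOURCE, "rb") as src, opener(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    print(f"{target}: {os.path.getsize(target)} bytes "
          f"in {time.time() - start:.1f} s")

compress_with(gzip.open, ".gz")   # DEFLATE, the algorithm behind gzip
compress_with(bz2.open, ".bz2")   # Burrows-Wheeler, the algorithm behind bzip2

Numbers from a sketch like this will not match the standalone tools exactly (compression levels and I/O handling differ), but the trade-off it exposes, compressed size versus time spent, is precisely the one worth measuring for backups.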