First, a retraction, it turns out that the performance problem with datatimes in the previous article wasn’t due to high cardinality (I speculated too much here), but due to a type conversion issue. From a helpful comment from Victoria Eastwood of Infobright (a good sign for a startup), the Infobright engine considered ‘2001-01-01’ to be a date, not a datetime, and it couldn’t do a conversion to a datetime. Instead it pushed the date filtering logic from the Infobright engine to MySQL. Effectively, the slow queries were a table scan. The solution is to add the 00:00:00 to the dates to make them datetimes.
With that in mind, here are some much better numbers for Infobright. For Infobright this query took 0.05 seconds.
1) Select sum(unit) from Sale where purchaseDate >= '2001-04-01 00:00:00' and purchaseDate < '2001-05-01 00:00:00'
This compares very …
[Read more]