I’m visiting a Pentaho customer right now whose current “transaction” volume is 200 million rows per day. Relatively speaking, this puts their planned warehouse in the top quintile by size. They will face significant issues with load times, data storage, processing reliability, etc. Kettle is the tool they selected, and it is working really well. Distributed record processing using Kettle and a FOSS database is a classic case study for Marten’s scale-out manifesto.
This organization doesn’t have an unlimited budget. Specifically, they don’t have a telecom-type budget for their telecom-like volume of data. One of the issues that has come up with their implementation has been the tradeoff between space and keeping the …