Kettle workshop at KHM

Good news Kettle fans!

Our community is bound to become a bit larger, as a whole group of students (38) at the Katholieke Hogeschool Mechelen (Bachelor level) will receive a one-day workshop on Pentaho Data Integration (Kettle).  This workshop will take place in early November, most likely on the 4th.

It’s interesting to see that in a single day we’ll be able to go through most of the work involved: reading and staging the data, cleansing it, and loading a few slowly changing dimensions along with a fact table.  Along the way we’ll explain how to use Pentaho Data Integration in that setting.  If time permits, we’ll show how to set up a metadata model on top of that data to create reports on it.  Finally, the students will get an idea of what open source is all about.

Dead wrong

Belgian consultancy company Element 61 has just posted an opinion piece under the guise of a review of open source ETL.

What a load of utter nonsense.  Try reading this:

Instead of using SQL statements to transform data, an Open Source ETL tool gives the developer a standard set of functions, error handling rules and database connections. The integration of all these different components is done by the Open Source ETL tool provider. The straightforward transformations can be implemented very quickly, without the hassle of writing queries, connecting to data sources or writing your own error handling process. When there are complex transformations to make, Open Source ETL tools will often not offer out-of-the-box solutions.

T-Dose 2008

Roland Bouman and I will be doing a presentation together at T-Dose on October 25th:

Building Open Source BI solutions with Pentaho and MySQL

It’s a free conference, feel free to join us there for a chat and/or a drink!

Until then,

Getting started with Kettle

For those people starting with Kettle (Pentaho Data Integration) we created a Getting Started page on our Wiki.

Since I realized that for some people simple and easy can never be simple and easy enough, I created 8 mini Flash demos:

A case for Kettle for your next ETL or data warehouse project

I am, for the most part, a do-it-yourself type of person. I fix my own car if I can; I even have four healthy tomato plants growing in pots outside as we speak. The plants will take that little extra CO2 out of the air and give me great-tasting tomatoes (soon… I hope!)

But I digress.

Whether to use an ETL tool such as Kettle (aka Pentaho Data Integration) for a project involving large data transfers is a typical “build vs. buy” type of decision, one that is fairly well understood, and I don’t wish to repeat it all here. By putting together some Perl scripts to do the job, you typically get great performance, development speed and accessibility. This would need to be balanced against the benefits of ETL tools and their potential drawbacks (development speed, license costs and performance …

