Showing entries 31 to 40 of 78
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: Pentaho (reset)
451 CAOS Links 2010.02.02

Oracle’s plans for Sun’s OSS. The UK’s updated OSS strategy. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca
“Tracking the open source news wires, so you don’t have to.”

Oracle’s plans for Sun’s OSS
# Oracle’s MySQL strategy slide.

# eWeek reported that database thought leaders are divided on Oracle MySQL.

# Savio Rodrigues and Computerworld on Oracle’s plans for MySQL, other open source assets.

# Zack Urlocker is leaving Oracle/Sun/MySQL.

[Read more]
Encrypt PDI passwords

PDI has a basic obfuscation method that makes it harder for a casual observer to lift passwords for DB connections. I have customers who maintain different versions of a “shared.xml” file, each defining different physical connections to databases (think development, QA/testing, and production).
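To make "basic obfuscation" concrete, here is a minimal sketch of a reversible, XOR-based scheme in Java. It is an illustration only and is not claimed to be PDI's actual algorithm or output format: the class name, the method names, and the seed value are all hypothetical. The point it shows is that anyone who knows the fixed seed can recover the password, so this deters casual reading rather than providing real encryption.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: NOT PDI's real implementation.
final class ObfuscationSketch {
    // Hypothetical fixed seed; anyone who knows it can reverse the output.
    private static final BigInteger SEED =
            new BigInteger("0123456789012345678901234567890123456789");
    private static final int RADIX = 16;

    // XOR the password bytes (read as one big integer) with the seed, emit hex.
    static String obfuscate(String password) {
        if (password == null || password.isEmpty()) {
            return "";
        }
        BigInteger plain = new BigInteger(password.getBytes(StandardCharsets.UTF_8));
        return SEED.xor(plain).toString(RADIX);
    }

    // XOR with the same seed again to get the original bytes back.
    static String reveal(String obfuscated) {
        if (obfuscated == null || obfuscated.isEmpty()) {
            return "";
        }
        BigInteger plain = new BigInteger(obfuscated, RADIX).xor(SEED);
        return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With a helper like this, `ObfuscationSketch.reveal(ObfuscationSketch.obfuscate("s3cret"))` returns the original string for ordinary ASCII passwords, which is exactly why such a value should still be treated as sensitive.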

In order to generate the different shared.xml files, a user usually has to open up PDI, create the connections, save them, and then sometimes copy and paste the sections needed to create their “dev” version of shared.xml or their “production” version of shared.xml (per Matt Casters' comment below, there is a utility that allows users to do this outside of Spoon). Many times this is just to generate the password, as they can hand-edit the other pieces (hostname, schema, etc.).

I just committed a quick little PDI …

[Read more]
Re-Introducing UDJC

Dear Kettle fans,

Daniel & I had a lot of fun in Orlando last week. Among other things we worked on the User Defined Java Class (UDJC) step. If you have a bit of Java experience, this step allows you to quickly write your own plugin in a step. This step is available in recent builds of Pentaho Data Integration (Kettle) version 4.

Now, how does this work? Well, let’s take Roland Bouman’s example: the calculation of the date of Easter. In this blog post, Roland explains how to calculate Easter in MySQL and Kettle using JavaScript. OK, so what if you want this calculation to be really fast in Kettle? Well, then you can turn to pure Java to do the job…
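As a rough idea of what "pure Java" could look like for this calculation, here is a standalone sketch of the well-known anonymous Gregorian (Meeus/Jones/Butcher) Easter algorithm. It is not the code from the post and it is not wired into the UDJC step: the class and method names are hypothetical, and it uses `java.time.LocalDate` from current Java rather than whatever the step itself expects.

```java
import java.time.LocalDate;

// Illustrative sketch only: plain Java, not an actual UDJC step body.
final class EasterSketch {
    // Gregorian Easter Sunday for a year, using integer arithmetic only.
    static LocalDate easterSunday(int year) {
        int a = year % 19;            // position in the 19-year lunar cycle
        int b = year / 100;           // century
        int c = year % 100;           // year within the century
        int d = b / 4;
        int e = b % 4;
        int f = (b + 8) / 25;
        int g = (b - f + 1) / 3;
        int h = (19 * a + b - d - g + 15) % 30;
        int i = c / 4;
        int k = c % 4;
        int l = (32 + 2 * e + 2 * i - h - k) % 7;
        int m = (a + 11 * h + 22 * l) / 451;
        int n = h + l - 7 * m + 114;
        // n / 31 is the month (3 = March, 4 = April); n % 31 + 1 is the day.
        return LocalDate.of(year, n / 31, n % 31 + 1);
    }

    public static void main(String[] args) {
        System.out.println(easterSunday(2010)); // prints 2010-04-04
    }
}
```

Because the whole calculation is a handful of integer operations with no interpreter in between, this is the kind of logic that tends to benefit from being moved out of a scripting step.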

[Read more]
A guide to The 451 Group’s open source software coverage

Regular visitors to the 451 CAOS Theory blog will be well aware of The 451 Group’s CAOS (Commercial Adoption of Open Source) research service and our CAOS long-form reports.

They are probably less aware of the open source coverage that The 451 Group provides on a day-to-day and week-to-week basis, however, and I thought it would be worthwhile to provide some examples of The 451 Group’s ongoing open source coverage by highlighting a few recent reports.

The company’s core services are 451 Market Insight Service, which delivers daily insight into emerging enterprise IT markets, and 451 TechDealmaker, a forward-looking weekly …

[Read more]
A review of Pentaho Solutions by Roland Bouman and Jos van Dongen

Pentaho Solutions

Pentaho Solutions, Business Intelligence and Data Warehousing with Pentaho and MySQL. By Roland Bouman and Jos van Dongen, Wiley 2009. Page count: about 570 pages. (Here’s a link to the publisher’s site.)

The book is big in part because it’s about a GUI tool, so there are the requisite number of screenshots (but not too many). It is structured into four parts, each on a different topic.

The first part is 4 chapters on getting started with Pentaho: from a quick-start through …

[Read more]
451 CAOS Links 2009.11.06

Funambol acquires Zapatec. Open source gains Closure. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca
“Tracking the open source news wires, so you don’t have to.”

For the latest on Oracle’s acquisition of MySQL via Sun, see Everything you always wanted to know about MySQL but were afraid to ask

# Funambol acquired Zapatec, an AJAX web 2.0 frameworks vendor.

# The top ten issues facing open source users, according to Mark Radcliffe.

# Google …

[Read more]
Instant Relief from Slow MySQL Reporting Queries using LucidDB

Here’s the scenario. You’ve got a table in MySQL for reporting that has a few million rows, and is denormalized for reporting. You’ve got a Pentaho Report that is querying this MySQL table. You have two problems with the current report.

  1. Your users are complaining that the query is slow, and they have to wait around for longer than they’d like to see their report. (approx 40s)
  2. Your DBAs are cranky because they see the size of this table is getting bigger. (approx 1.8GB)

MySQL is fundamentally designed to be an OLTP database and while it does a fantastic job at that, its data warehouse features were built as “bolt on” additions. Can it be used for BI? Absolutely, I’ve used it at many customer sites. Does LucidDB provide a better set of features/capabilities for doing BI? We think so! Are they both 100% open source? You bet; why not choose the right tool for the …

[Read more]
Setting up Development and Production Pentaho PDI Repositories

I’ve been setting up a Pentaho Data Integration system with the goals of supporting collaboration with my team, allowing easy deployment to test or production, and enabling remote monitoring and troubleshooting of jobs and transformations.

I’ve finally figured out a way to achieve these goals, so I’ll try to pass this on now. I found the book "Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL", by Roland Bouman and Jos van Dongen to be a big help in figuring out how to export/import. It definitely helped me get up and running quickly.

My first decision was to bet the farm on the use of a repository. A file-based system would probably work, but I felt that it would require too much file distribution and usage of remote terminals. So I’ve set up two separate repositories hosted on MySQL databases: one for development (DEV), and one for production (PRD). Here are the steps I …

[Read more]
Displaying stored procedure result set meta data in Pentaho

This probably won't be a very well written post since I am working frantically on a proof of concept using Pentaho Data Integration / Kettle for the ETL in a new data warehouse project. I have just a couple of days to get it to work or I'll end up having to use Perl...which will make me hurl.

I want to use a MySQL stored procedure for the transformation input, which is easy to do with the "Table Input" step (just CALL the stored proc in the SQL section), but the field names of the result set don't show up downstream in subsequent steps. When I right-click on a downstream step and select "show input fields", an "I Can't find any fields" message box pops up.

Some may find this a minor annoyance, but it makes subsequent steps difficult to deal with if you can't visualize the structure of the data stream in the transformation.

I saw some posts recommending the use of a "Select Values" Step, but for some reason, I …

[Read more]
451 CAOS Links 2009.10.13

Larry Ellison promises funds for MySQL, commits to community. The “open source vendor” debate in a nutshell. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca
“Tracking the open source news wires, so you don’t have to.”

# Larry Ellison promised MySQL will receive more money for development and research, while Oracle maintained that it is committed to Java and open source developer communities.

# GroundWork raised $5m series D funding from Canaan Partners, Mayfield, JAFCO Ventures and SAP Ventures.

# InformationWeek reported that Motorola has vacated …

[Read more]