Displaying posts with tag: Pentaho
MDX: retrieving the entire hierarchy path with Ancestors()

A couple of days ago I wrote about one of my forays into MDX land (Retrieving denormalized tabular results with MDX). The topic of that post was how to write MDX so as to retrieve the kind of flat, tabular results one gets from SQL queries. An essential point of that solution was the MDX Ancestor() function.

I stumbled upon the topic of my previous blog post while I was researching something else entirely. Creating flat tables and looking up individual ancestors is actually a rather specific application of a much more general solution I found initially.

Pivot tables and the "Show Parents" functionality

GUI OLAP tools typically offer a pivot table query interface. They let you drag and drop measures and dimension items, like …
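To make the "Show Parents" idea concrete, here is a minimal sketch in Java using olap4j against a Mondrian connection. The driver class, JDBC URL, catalog path, cube and level names are placeholder assumptions based on the FoodMart sample, not taken from the post. Rather than building the path in MDX with Ancestors(), it reconstructs the same full hierarchy path on the client side by walking Member.getParentMember(), which is a handy way to cross-check the MDX results.

```java
import java.sql.Connection;
import java.sql.DriverManager;

import org.olap4j.CellSet;
import org.olap4j.OlapConnection;
import org.olap4j.OlapStatement;
import org.olap4j.Position;
import org.olap4j.metadata.Member;

public class ShowParentsSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder Mondrian/olap4j connection; JDBC URL, catalog path and
        // schema names are assumptions based on the FoodMart sample setup.
        Class.forName("mondrian.olap4j.MondrianOlap4jDriver");
        Connection jdbc = DriverManager.getConnection(
                "jdbc:mondrian:Jdbc=jdbc:mysql://localhost/foodmart;"
                + "Catalog=/path/to/FoodMart.xml;");
        OlapConnection olap = jdbc.unwrap(OlapConnection.class);

        OlapStatement stmt = olap.createStatement();
        CellSet cellSet = stmt.executeOlapQuery(
                "SELECT {[Measures].[Unit Sales]} ON COLUMNS, "
                + "[Product].[Product Department].Members ON ROWS "
                + "FROM [Sales]");

        // For each row member, rebuild the full hierarchy path client-side;
        // this is what the "Show Parents" toggle displays in a pivot table.
        for (Position row : cellSet.getAxes().get(1).getPositions()) {
            for (Member member : row.getMembers()) {
                StringBuilder path = new StringBuilder(member.getName());
                for (Member m = member.getParentMember(); m != null; m = m.getParentMember()) {
                    path.insert(0, m.getName() + " / ");
                }
                System.out.println(path);
            }
        }
        olap.close();
    }
}
```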

[Read more]
The Data Day, A few days: April 22-26 2013

Pivotal launches. SkySQL and Monty Program merge. And much, much more

Our report on the changes in the MySQL ecosystem is now available for 451 clients and non-clients alike at …

— Matt Aslett (@maslett) April 25, 2013

For 451 Research clients: VMware expands Serengeti’s horizons with updated Hadoop virtualization project

— Matt Aslett (@maslett) April 26, 2013

For 451 Research clients: SkySQL, Monty Program merge to support MariaDB following formation of MariaDB Foundation

— Matt Aslett (@maslett) …

[Read more]
451 CAOS Links 2011.11.01

Appcelerator raises $15m. Hortonworks launches Data Platform. And more.

# Appcelerator raised $15m in a third round led by Mayfield Fund, Translink Capital and Red Hat.

# Modo Labs closed a $4m investment from Storm Ventures and New Magellan Ventures.

# Hortonworks launched its Hortonworks Data Platform Apache Hadoop distribution, as well as a new partner program. Eric Baldeschwieler put the …

[Read more]
451 CAOS Links 2011.10.07

OpenStack Foundation. New Pentaho CEO. And more.

# Rackspace announced its intention to form an independent OpenStack Foundation.

# HP has chosen Ubuntu as the lead host and guest operating system for its Public Cloud.

# Pentaho appointed Quentin Gallivan as its new CEO.

# Hortonworks continued the discussion about contributions to Apache Hadoop.

# Bob Bickel explained why CloudBees is not, itself, open …

[Read more]
Proposals for Codebits.EU

Codebits is an annual three-day conference about software and, well, code. It's organized by SAPO and this year's edition will be held November 10 through 12 at the Pavilhão Atlântico, Sala Tejo in Lisbon, Portugal.

I've never attended SAPO Codebits before, but I've heard good things about it from Datacharmer Giuseppe Maxia. The interesting thing about the way this conference is organized is that all proposals are available to the public, which can also vote on them. This year's proposals are looking very interesting already, with high quality proposals from …

[Read more]
451 CAOS Links 2011.08.09

Opscode appoints a new CEO. SugarCRM gains a new CFO. And more.

# Opscode named Mitch Hill as CEO, with Jesse Robbins becoming Chief Community Officer.

# SugarCRM claimed billings up 58% in Q2 and appointed a new CFO.

# Tasktop released Tasktop Dev 2.1 and announced Tasktop Sync 1.0.

# Pentaho delivered improved support for Hadoop …

[Read more]
Real-time streaming data aggregation

Dear Kettle users,

Most of you usually use a data integration engine to process data in a batch-oriented way. Pentaho Data Integration (Kettle) is typically deployed to run monthly, nightly, or hourly workloads. Sometimes folks run micro-batches of work every minute or so. However, it's less well known that our beloved transformation engine can also be used to stream data indefinitely (never ending) from a source to a target. This sort of data integration is sometimes referred to as being “streaming”, “real-time”, “near real-time”, “continuous” and so on. Typical examples of situations where you have a never-ending supply of data that needs to be processed the instant it becomes available are JMS (Java Message Service), RDBMS log sniffing, on-line fraud analyses, web or application …
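To illustrate what a never-ending transformation looks like when the engine is embedded, here is a minimal sketch using the PDI Java API (class names are from the PDI 4.x code line; the transformation file name and the assumption that it contains a streaming input step such as a JMS consumer are hypothetical). The key difference from a batch run is that we start the transformation and keep the process alive instead of waiting for it to finish.

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class StreamingTransformationRunner {
    public static void main(String[] args) throws Exception {
        // Initialize the Kettle environment (plugins, logging, etc.).
        KettleEnvironment.init();

        // Hypothetical transformation that reads from a never-ending source,
        // e.g. a JMS consumer step, and writes to a target as rows arrive.
        TransMeta transMeta = new TransMeta("stream_jms_to_target.ktr");
        Trans trans = new Trans(transMeta);

        // Start the transformation; with a streaming input step it will not
        // finish on its own, so we do not call waitUntilFinished() here.
        trans.execute(null);

        // Stop the stream cleanly on JVM shutdown (Ctrl-C, service stop, ...).
        Runtime.getRuntime().addShutdownHook(new Thread(trans::stopAll));

        // Keep the main thread alive while the engine streams rows.
        Thread.currentThread().join();
    }
}
```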

[Read more]
451 CAOS Links 2011.07.01

A herd of Hadoop announcements. Rockmelt raises $30m. And more.

A herd of Hadoop announcements
# Yahoo! and Benchmark Capital confirmed the formation of Hortonworks, an independent company focused on the development and support of Apache Hadoop.

# Cloudera announced the availability of Cloudera Enterprise 3.5 and the launch of Cloudera SCM Express, based on the new Service and Configuration Manager in Cloudera Enterprise 3.5.

# MapR …

[Read more]
PDI Loading into LucidDB

By far, the most popular way for PDI users to load data into LucidDB is to use the PDI Streaming Loader. The streaming loader is a native PDI step that:

  • Enables high performance loading, directly over the network without the need for intermediate IO and shipping of data files.
  • Lets users choose more interesting (from a DW perspective) load types into tables. In particular, in addition to simple INSERTs it allows for MERGE (aka UPSERT) and UPDATE, all done in the same bulk loader (see the sketch right after this list).
  • Enables the metadata for the load to be managed, scheduled, and run in PDI.
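The MERGE (UPSERT) load type mentioned above can also be pictured as plain SQL against LucidDB. The sketch below is not what the streaming loader executes internally; it is only a JDBC illustration of the same UPSERT semantics, with hypothetical table names and an assumed driver class and connection URL (check your LucidDB release for the exact values).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LucidDbMergeSketch {
    public static void main(String[] args) throws Exception {
        // Assumed LucidDB JDBC driver class and URL; verify against the
        // documentation of your LucidDB release before using them.
        Class.forName("org.luciddb.jdbc.LucidDbClientDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:luciddb:rmi://localhost", "sa", "");
             Statement stmt = conn.createStatement()) {

            // Hypothetical staging and target tables: update matching rows,
            // insert the rest -- the UPSERT behavior described above.
            stmt.executeUpdate(
                "MERGE INTO dw.customer_dim t "
                + "USING staging.customer_updates s "
                + "ON t.customer_id = s.customer_id "
                + "WHEN MATCHED THEN UPDATE SET "
                + "  name = s.name, city = s.city "
                + "WHEN NOT MATCHED THEN INSERT (customer_id, name, city) "
                + "  VALUES (s.customer_id, s.name, s.city)");
        }
    }
}
```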

However, we’ve had some known issues. In fact, until PDI 4.2 GA and LucidDB 0.9.4 GA it was pretty problematic unless you ran through the process of patching LucidDB outlined on this page: …

[Read more]
HPCC vs Hadoop at a glance


Since this article was written, HPCC has undergone a number of significant changes and updates. These address some of the criticisms voiced in this blog post, such as the license (updated from AGPL to Apache 2.0) and integration with other tools. For more information, refer to the comments left by Flavio Villanustre and Azana Baksh.

The original article can be read unaltered below:

Yesterday I noticed a tweet by Andrei Savu that prompted me to read the related GigaOM article and then check out the HPCC Systems …

[Read more]