Showing entries 41 to 50 of 87
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: Data Integration
Graph Databases and the Future of Large-Scale Knowledge Management

Todd Hoff has posted a link to a Los Alamos National Lab presentation on Graph Databases.  In it, the authors revisit the classic RDBMS vs Graph database debate.

The Relational Database hasn’t maintained its dominance out of dumb luck.  Instead, the RDBMS has consistently outperformed the alternatives while providing the most general-purpose capability of all the platforms that have been available.  Many other approaches have been tried, and these have often provided better object model integration (OODBMS) or better data model representation.  But when the …

[Read more]
The Argument For & Against Map/Reduce

The last 24 months have seen the introduction of Map/Reduce functionality into the data processing arena in various forms.  Map/Reduce is a framework for developing scalable data processing functionality, and was popularized by Google (see this earlier post).

Pure players like Hadoop are starting to find their own niche, helped by organizations such as Cloudera.  However, there have been a number of arguments for and against Map/Reduce functionality inside the database.

These arguments are now largely moot.  Customers have recognized value in Map/Reduce, prompting some (b)leading edge database vendors to introduce such …

[Read more]
Top 10 interesting companies in Data Management

A bit of fun for a Sunday.  Below is the list of my top 10 interesting companies in Data Management right now.  Interesting to me means doing new stuff and being somewhat disruptive, or having a “watch and see” quality about them.  Note this is about companies, not data management applications.

While I find a bunch of other data management applications interesting (PNUTS, Cassandra, Redis etc) these aren’t really encapsulated in a company with a go to market strategy.

10gen – They are making interesting noises, but not sure about delivery yet
Amazon – SimpleDB is neat, but not a grown up data platform yet
Aster Data – Doing funky things with Map/Reduce
GroovyChannel – Are they nuts, they have to change …

[Read more]
Google Goodies and Lego

Dear Kettle friends,

Will Gorman and Mike D’Amour, Senior Developers at Pentaho, are presenting Pentaho’s Google integration work at the Google I/O Developer Conference (in the Sandbox area, to be specific).  Pentaho announced as much yesterday.

Here are a few of the integration points:

  • Google maps dashboard (available in the Pentaho BI server you can download)
  • A new Google Docs step was created for Pentaho Data Integration Enterprise Edition
  • Running (AVI, 30MB) the …
[Read more]
PDI cloud : massive performance roundup

Dear Kettle fans,

As expected there was a lot of interest in cloud computing at the MySQL conference last week.  It felt really good to be able to pass the Bayon Technologies white paper around to friends, contacts and analysts.  It’s one thing to demonstrate a certain scalability on your blog, it’s another entirely to have a smart man like Nicholas Goodman do the math.

Sorting massive amounts of rows is a hard problem to take on.  Making it scale on low-cost EC2 instances is interesting as it …

[Read more]
Next week : MySQL UC

Dear Kettle & MySQL fans!

I’m really looking forward to going to the MySQL User Conference next week, not just because I’m speaking in 2 sessions again, but perhaps also because these are “interesting” times for MySQL and Sun Microsystems.  Pivotal times, it would seem.

Here are the 2 sessions I’m going to do:

  • Cloud Computing with MySQL and Kettle : I’m particularly happy that MySQL accepted this session: it will demonstrate how easy it has become to do cloud computing exercises with tools like MySQL and Kettle.
[Read more]
Resource exporter

Dear Kettle fans,

One of the things that’s been on my TODO list for a while was the creation of a resource exporter.

Resource exporter?

It’s called “Resource exporter” and not “Job exporter” or “Transformation exporter” because it is intended to export more than just a single job or transformation.  It exports all linked resources of a job or transformation.

This means that if you have a job that has 5 transformation job entries, you will be exporting 6 resources (1 job and 5 transformations).  If those transformations use 3 sub-transformations (mappings), you will export 9 resources in total.

The whole idea behind this exercise is to be able to create a package (for example to send to someone) that has all needed resources contained in a single zip file.
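To make the counting above concrete, here is a minimal Python sketch of the idea. It is not Kettle's actual exporter: the `LINKS` map, the file names and both functions are hypothetical, and the zip entries hold placeholder text rather than real job or transformation XML. It only illustrates following every link from one job and packing everything that was found into a single zip file.

```python
import io
import zipfile

# Hypothetical link map (an assumption for illustration only):
# each resource name maps to the resources it references.
LINKS = {
    "nightly_load.kjb": ["t1.ktr", "t2.ktr", "t3.ktr", "t4.ktr", "t5.ktr"],
    "t1.ktr": ["mapping_a.ktr"],
    "t2.ktr": ["mapping_b.ktr"],
    # t3 also reuses mapping_a; a shared resource is exported only once.
    "t3.ktr": ["mapping_c.ktr", "mapping_a.ktr"],
}


def collect_resources(root, links):
    """Return the root plus every resource reachable from it, without duplicates."""
    ordered = []
    visited = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in visited:
            continue
        visited.add(name)
        ordered.append(name)
        # Follow the links of this resource (none if it has no entry).
        stack.extend(links.get(name, []))
    return ordered


def export_to_zip(root, links, target):
    """Write one zip entry per linked resource; `target` is a path or file object."""
    names = collect_resources(root, links)
    with zipfile.ZipFile(target, "w") as archive:
        for name in names:
            # Placeholder content; a real exporter would write the resource itself.
            archive.writestr(name, f"placeholder for {name}")
    return names


if __name__ == "__main__":
    buffer = io.BytesIO()
    exported = export_to_zip("nightly_load.kjb", LINKS, buffer)
    print(len(exported), "resources exported")
```

With this map the job links to 5 transformations, which in turn link to 3 distinct mappings, so the walk finds 1 + 5 + 3 = 9 resources, matching the example above.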

Let’s …

[Read more]
Pentaho Partner Summit ‘09

Dear reader,

In a little over 3 weeks, April 2nd and 3rd, we’re organizing a Pentaho Partner Summit at the Quadrus Conference Center in Menlo Park near San Francisco.

If you are (as the invitation describes) an “Executive, luminary, current or prospective partner from around the world” and you come over, you’ll meet me, Julian Hyde and perhaps a couple of other architects as well.  That is in addition to a host of other interesting people like Zack Urlocker (MySQL) and of course Richard Daley, our CEO.  We’ll be doing a couple of lengthy sessions on Kettle and Mondrian …

[Read more]
Is the Relational Database Doomed?

Recently, a lot of new non-relational databases have cropped up both inside and outside the cloud. One key message this sends is, "if you want vast, on-demand scalability, you need a non-relational database".

If that is true, then is this a sign that the once mighty relational database finally has a chink in its armor? Is this a sign that relational databases have had their day and will decline over time? In this post, we'll look at the current trend of moving away from relational databases in certain situations and what this means for the future of the relational database.

[more]

Kickfire: Data Analytics for the Masses

You may not realize it, but the data analytics market is buzzing. There are new vendors emerging, new products popping up, new deals being done, and several new strategies being pursued. Vendors are predominantly chasing big data, with battle lines being drawn by solution providers that cater to data sets of between roughly 100 TB and 10 PB. The battle was inevitable because the world is producing data at a phenomenal rate, and we have an increasing need to analyze it within shorter time frames. In this post we analyze one of these vendors, Kickfire.

Yet while the big names in town are capturing the headlines, in reality only a small percentage of businesses today need to be able to analyze petabytes of data. Today, the rest of us are more likely to deal with analytic data sets in the 50 GB to 3 TB range.

Kickfire is interesting because it has decided to let the other vendors fight it out for the massive data …

[Read more]