Showing entries 1 to 10 of 31
10 Older Entries »
Displaying posts with tag: talend
Big Data Integration & ETL - Moving Live Clickstream Data from MongoDB to Hadoop for Analytics

June 16, 2014 By Severalnines

MongoDB is great at storing clickstream data, but using it to analyze millions of documents can be challenging. Hadoop provides a way of processing and analyzing data at large scale. Since it is a parallel system, workloads can be split across multiple nodes, and computations on large datasets can be done in relatively short timeframes. MongoDB data can be moved into Hadoop using ETL tools like Talend or Pentaho Data Integration (Kettle).

In this blog post, we’ll show you how to integrate your MongoDB and Hadoop datastores using Talend. We have a MongoDB database collecting clickstream data from several websites. We’ll create a job in Talend to extract the documents from MongoDB, transform them, and then load them into HDFS. We will also show you how to schedule this job to run every 5 minutes.
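
As a rough illustration of the extract-and-transform idea only (this is not the Talend job itself), here is a minimal Python sketch that keeps the documents newer than a checkpoint and flattens them into tab-delimited lines, the kind of text file that could then be written to HDFS. The field names (`domain`, `url`, `timestamp`), the checkpoint filter and both helper functions are assumptions made for illustration; the post does not specify the document schema or how incremental loads are tracked.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical field names; the original post does not list the schema.
FIELDS = ["_id", "domain", "url", "timestamp"]


def newer_than(documents, checkpoint):
    """Keep only documents whose timestamp is after the last checkpoint."""
    return [doc for doc in documents if doc["timestamp"] > checkpoint]


def to_delimited(documents, fields=FIELDS, delimiter="\t"):
    """Flatten MongoDB-style documents (dicts) into delimited text lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for doc in documents:
        row = []
        for name in fields:
            value = doc.get(name, "")
            # Serialize datetimes as ISO 8601 strings.
            if isinstance(value, datetime):
                value = value.isoformat()
            row.append(value)
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    docs = [
        {"_id": 1, "domain": "a.example", "url": "/x",
         "timestamp": datetime(2014, 6, 16, 10, 0, tzinfo=timezone.utc)},
        {"_id": 2, "domain": "b.example", "url": "/y",
         "timestamp": datetime(2014, 6, 16, 10, 5, tzinfo=timezone.utc)},
    ]
    last_run = datetime(2014, 6, 16, 10, 2, tzinfo=timezone.utc)
    print(to_delimited(newer_than(docs, last_run)), end="")
```

Running a filter like this on each scheduled run would, under these assumptions, pick up only the documents added since the previous run rather than re-exporting the whole collection.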

Test Case

We have an application …

[Read more]
The Data Day, Two days: February 13/14 2013

TempoDB’s timely DBaaS for the Internet of Things. ScaleBase 2.0. And more

For 451 Research clients: TempoDB has timely database service for the Internet of Things bit.ly/YcQuqA

— Matt Aslett (@maslett) February 13, 2013

For 451 Research clients: ScaleBase provides centralized management of distributed MySQL databases bit.ly/YcQTcs

— Matt Aslett (@maslett) February 13, 2013

For 451 Research clients: XtremeData turns its attention to cloud-based data warehousing bit.ly/XB7MLY

— Matt Aslett (@maslett) …

[Read more]
451 CAOS Links 2011.12.09

Funding for BlazeMeter and Digital Reasoning. Red Hat goes unstructured. And more.

# BlazeMeter announced $1.2m in Series A funding and launched a cloud service for load and performance testing.

# Digital Reasoning announced a second round of funding to help develop its Hadoop-based analytics offering.

# Red Hat announced the availability of Red Hat Storage Software Appliance, based on its recent acquisition of Gluster.

# Red Hat also …

[Read more]
451 CAOS Links 2011.12.02

Talend delivers v5. Zentyal raises series A. The TCO of OSS. And more.

# Talend announced version 5 of its data integration suite, adding business process management capabilities via an OEM relationship with BonitaSoft. Yves De Montcheuil explained the name changes in version 5.

# Zentyal closed a series A venture capital funding of over $1m by Open Ocean …

[Read more]
451 CAOS Links 2011.07.01

A herd of Hadoop announcements. Rockmelt raises $30m. And more.

A herd of Hadoop announcements
# Yahoo! and Benchmark Capital confirmed the formation of Hortonworks, an independent company focused on the development and support of Apache Hadoop.

# Cloudera announced the availability of Cloudera Enterprise 3.5 and the launch of Cloudera SCM Express, based on the new Service and Configuration Manager in Cloudera Enterprise 3.5.

# MapR …

[Read more]
451 CAOS Links 2011.06.10

Yet more Apache OpenOffice fall-out. Bacula Systems raises $5m. And more.

# As the proposal to incubate OpenOffice.org at Apache went live, controversy about the proposal continued. The Free Software Foundation unsurprisingly voiced its support for the LGPL LibreOffice project, while Keith Curtis outlined his opposition to the plan.

# Bacula Systems raised $5m from KM Capital Partners and from the Swiss Canton of Vaud.

# Joe Brockmeier …

[Read more]
451 CAOS Links 2011.05.10

EMC launches Greenplum HD. DataStax releases Brisk. And more.

# EMC launched its Greenplum HD Hadoop distribution, with the support of Jaspersoft, Pentaho, and SnapLogic, among others.

# DataStax …

[Read more]
451 CAOS Links 2011.01.25

VMware grows 41%. Evidence of Java infringement disputed. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca, and daily at Paper.li/caostheory
“Tracking the open source news wires, so you don’t have to.”

# VMware announced full year revenue growth of 41% to $2.9bn.

# Alleged evidence of infringing Java code in Android disputed.

# Oracle nominated SouJava, the Brazilian Java User Group, to a seat in the JCP Executive Committee.

# The Document Foundation launched LibreOffice 3.3. …

[Read more]
451 CAOS Links 2011.01.18

Funding for OpenGamma. Riptano becomes DataStax. And more.

# OpenGamma raised $6m series B funding.

# Apache Cassandra-supporter Riptano changed its name to DataStax and has added 50 customers in 6 months.

# WANdisco acquired the SVNForum.org Subversion user community.

# Univa hired the principal engineers from the Grid Engine team, will publish a …

[Read more]
451 CAOS Links 2010.12.17

CPTN Holdings unmasked. Oracle updates MySQL. And more.

# Florian Mueller reported that CPTN Holdings, the company acquiring Novell’s patents, is made up of Microsoft, Apple, EMC and Oracle.

# The VAR Guy told the (previously) untold story of Novell’s sale to Attachmate.

# Attachmate committed to support the existing roadmaps and release schedules for Novell and SUSE products.

# Oracle …

[Read more]