Showing entries 1 to 3
Displaying posts with tag: Pig
Rosetta Stone: MySQL, Pig and Spark (Basics)

In a world where new data processing languages appear every day, it can be helpful to have tutorials explaining language characteristics in detail from the ground up. This blog post is not such a tutorial. It also isn’t a tutorial on getting started with MySQL or Hadoop, nor is it a list of best practices for the various languages I’ll reference here – there are bound to be better ways to accomplish certain tasks, and where a choice was required, I’ve emphasized clarity and readability over performance. Finally, this isn’t meant to be a quickstart for SQL experts to access Hadoop – there are a number of SQL interfaces to Hadoop such as Impala or Hive that make Hadoop incredibly accessible to those with existing SQL skills.

Instead, this post is a pale equivalent of the …

[Read more]
HPCC vs Hadoop at a glance


Since this article was written, HPCC has undergone a number of significant changes and updates. These address some of the criticism voiced in this blog post, such as the license (updated from AGPL to Apache 2.0) and integration with other tools. For more information, refer to the comments placed by Flavio Villanustre and Azana Baksh.

The original article can be read unaltered below:

Yesterday I noticed this tweet by Andrei Savu. This prompted me to read the related GigaOM article and then check out the HPCC Systems …

[Read more]
451 CAOS Links 2010.03.02

Novell’s Q1. The future of OpenSolaris. And more.

Follow 451 CAOS Links live @caostheory on Twitter and

“Tracking the open source news wires, so you don’t have to.”

# Novell reported Linux platform revenue of $37.5m in Q1, up 6.4%.

# Novell’s Linux business reportedly broke even as Microsoft deal revenues fade.

# As The H reported, Oracle exec Dan Roberts confirmed that OpenSolaris has a future at Oracle.

# Citrix acquired Paglo, launched GoToManage service.

# StatusNet …

[Read more]