Showing entries 1 to 19

Displaying posts with tag: MapReduce

Data Analytics at NBCUniversal. Interview with Matthew Eric Bassett.
“The most valuable thing I’ve learned in this role is that judicious use of a little bit of knowledge can go a long way. I’ve seen colleagues and other companies get caught up in the “Big Data” craze by spending hundreds of thousands of pounds sterling on a Hadoop cluster that sees a few megabytes [...]
On Big Data, Analytics and Hadoop. Interview with Daniel Abadi.
“Some people even think that “Hadoop” and “Big Data” are synonymous (though this is an over-characterization). Unfortunately, Hadoop was designed based on a paper by Google in 2004 which was focused on use cases involving unstructured data (e.g. extracting words and phrases from Webpages in order to create Google’s Web index). Since it was not [...]
Typical “Big” Data Architecture
Here is the typical “Big” data architecture, which covers most of the components involved in the data pipeline. More or less, we have the same architecture in production in a number of places [...]
MySQL and Hadoop

Introduction

"Improving MySQL performance using Hadoop" was the talk which I and Manish Kumar gave at Java One & Oracle Develop 2012, India. Based on the response and interest of the audience, we decided to summarize the talk in a blog post. The slides of this talk can be found here. They also include a screen-cast of a live Hadoop system pulling data from MySQL and working on the popular 'word count' problem.



MySQL and Hadoop have been popularly considered as 'Friends with benefits' and our talk was aimed at showing how!



  [Read more...]
A super-set of MySQL for Big Data. Interview with John Busch, Schooner.
“Legacy MySQL does not scale well on a single node, which forces granular sharding and explicit application code changes to make them sharding-aware and results in low utilization of servers” – Dr. John Busch, Schooner Information Technology. A super-set of MySQL suitable for Big Data? On this subject, I have interviewed Dr. John Busch, Founder, Chairman, [...]
451 CAOS Links 2011.08.23

Engine Yard acquires Orchestra. Red Hat considers NoSQL move. And more.

# Engine Yard announced a definitive agreement to acquire Orchestra, bringing PHP expertise to the Engine Yard platform.

# Red Hat’s CEO indicated the company is interested in a NoSQL or Hadoop acquisition.

# Gluster announced Apache Hadoop compatibility in the next GlusterFS release.

# Microsoft signed an agreement with China Standard Software Co (CS2C) to support CS2C

  [Read more...]
451 CAOS Links 2011.07.01

A herd of Hadoop announcements. Rockmelt raises $30m. And more.

A herd of Hadoop announcements
# Yahoo! and Benchmark Capital confirmed the formation of Hortonworks, an independent company focused on the development and support of Apache Hadoop.

# Cloudera announced the availability of Cloudera Enterprise 3.5 and the launch of Cloudera SCM Express, based on the new Service and Configuration Manager in Cloudera Enterprise 3.5.

# MapR


  [Read more...]
451 CAOS Links 2010.10.08

Patents! Patents! Patents! Canonical’s perfect 10. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca, and daily at Paper.li/caostheory
“Tracking the open source news wires, so you don’t have to.”

# Google responded to Oracle’s claims that its Android OS infringes copyrights and patents related to Java.

# Matt Asay evaluated the various patent claims against Android and its related devices.

# Microsoft licensed smartphone patents from ACCESS Co and a subsidiary of Acacia Research.

# Glyn Moody assessed what Microsoft’s


  [Read more...]
The SMAQ stack for big data

SMAQ report sections

→ MapReduce

→ Storage

→ Query

→ Conclusion

"Big data" is data that becomes large enough that it cannot be processed using conventional methods. Creators of web search engines were among the first to confront this problem. Today, social networks, mobile phones, sensors and science contribute to petabytes of data created daily.

To meet the challenge of processing such large data sets, Google created MapReduce. Google's work and Yahoo's creation of the Hadoop MapReduce implementation have spawned an ecosystem of big data processing tools.

As MapReduce has grown in  [Read more...]

MapReduce – DBInputFormat – Serialization on readers
Last week I was working on an EC2 MySQL server where one of the slaves was taking a lot of time to catch up, and the only job running on that server [...]
451 CAOS Links 2010.04.27

VMware and Salesforce.com launch VMforce. Red Hat provides Cloud Access. And more.

# VMware and Salesforce.com launched VMforce, a platform for developing and deploying Java cloud applications.

# Red Hat Cloud Access enables enterprises to use their Red Hat Enterprise Linux subscription on Amazon Web Services.

# Canonical announced Ubuntu 10.04 LTS Server Edition, Desktop Edition and ISV support.

# Novell


  [Read more...]
451 CAOS Links 2010.01.21

EC approves Oracle-Sun. Google patents MapReduce. And more.

EC approves Oracle-Sun

The European Commission cleared Oracle’s proposed acquisition of Sun Microsystems. While Larry Ellison is set to unveil Oracle’s Sun strategy on January 27th, Monty Widenius said he will go to the Court of First Instance to appeal the decision.

# Pro-open source political party formed in Hungary.

# Google patented MapReduce,





  [Read more...]
VMware, “Hey what ya’ building over there?”

Today I caught a tweet from Kara Swisher referencing some exclusive news she posted on Boomtown about VMware’s upcoming deal to buy Zimbra from Yahoo! This would be VMware’s second acquisition of an open source ISV in under a

  [Read more...]
451 CAOS Links 2009.07.21

Microsoft contributes to Linux. Acquia raises $8m. And more.

Microsoft contributes to Linux
Microsoft announced that it is to contribute device driver code to the Linux kernel under the GPLv2, prompting us to publish a CAOS Theory Q&A. Answering one question we failed to ask, ZDNet reported that Microsoft’s Linux contributions should find their way into the 2.6.32 release.

Acquia raises $8m
Mass High Tech reported Acquia has picked up



  [Read more...]
Is ScaleDB Using MapReduce? Competing with Hadoop?

I’ve had a few VCs ask how we compare to Hadoop and companies using MapReduce. With Google blessing MapReduce, it seems to be the cool new thing. I figure I’m going to have to explain this to VCs, so I might as well blog about it.

MapReduce is a process of dividing a problem into small pieces and distributing (mapping) those pieces to a large number of computers. Then it collects the processed data and merges (reduces) it into a result set. Hadoop provides the plumbing, so users focus on writing the query and Hadoop handles the dirty work of mapping and reducing. Such a query, using a procedural language like Java, is more complex than a comparable SQL query, but more on that below.
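
To make the map and reduce steps described above concrete, here is a minimal single-process sketch of the classic word count problem in Python. It is only an illustration of the idea and does not use the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and the grouping step that a framework like Hadoop would perform across many machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key (done by the framework in a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: merge all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of input splits."""
    mapped = []
    for document in documents:
        # In a real cluster each split would be mapped on a different machine.
        mapped.extend(map_phase(document))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    splits = ["big data is big", "data about data"]
    print(word_count(splits))
```

The user-written parts are only map_phase and reduce_phase; everything else stands in for the plumbing that the framework would provide.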

So what is MapReduce good for? It really shines when you want to summarize, analyze or transform a very large data set. This is why it is well suited to web data. Map reduce

  [Read more...]
PDI cloud : massive performance roundup

Dear Kettle fans,

As expected there was a lot of interest in cloud computing at the MySQL conference last week.  It felt really good to be able to pass the Bayon Technologies white paper around to friends, contacts and analysts.  It’s one thing to demonstrate a certain scalability on your blog, it’s another entirely to have a smart man like Nicholas Goodman do the math.

Sorting massive amounts of rows is a hard problem to take on.  Making it scale on low-cost EC2

  [Read more...]
Could Google be stymied by a lack of openness?

It seems almost churlish to wonder whether Google could be even more successful than it already is with a different strategy, but the company’s approach to open source and open development has come into focus in recent weeks.

On last week’s podcast we discussed whether the company should see the AGPL as more of an opportunity than a threat following Jay’s post about the company releasing more code under open source licenses.

Nik Cubrilovic over at TechCrunch, meanwhile, has written an interesting

  [Read more...]
Summary of beCamp 2008

Yesterday I went to beCamp 2008 along with four roomfuls of other people interested in technology (perhaps close to 100 people total). The conference was a lot of fun. Not everything went as planned, but that was as planned. This was an Open Spaces conference and I thought it worked very well. From an email Eric Pugh sent:

Basically it all boils down to:

Open Space is the Law of Two Feet: if anyone finds themselves in a place where they are neither learning nor contributing they should move to somewhere more productive. And from the law flow four principles:

  • Whoever comes are the right people
  • Whatever happens is the only thing that could have
  • Whenever it starts
  [Read more...]

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.