Showing entries 81 to 90 of 164
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: hadoop
Data Science vs. Data Analytics

As this topic came up a few times this week for discussion at various places, I thought of composing a post on “Data Scientist vs. Data Analytics Engineer”; even though[...]

On Big Data, Analytics and Hadoop. Interview with Daniel Abadi.

“Some people even think that “Hadoop” and “Big Data” are synonymous (though this is an over-characterization). Unfortunately, Hadoop was designed based on a paper by Google in 2004 which was focused on use cases involving unstructured data (e.g. extracting words and phrases from Webpages in order to create Google’s Web index). Since it was not [...]

Distributed Clustering Services

Apart from my consulting as part of ScaleIn, I also invest to bootstrap companies with really disruptive ideas, and in the process met a few database-specific companies who are already [...]

Typical “Big” Data Architecture

Here is the typical “Big” data architecture that covers most components involved in the data pipeline. More or less, we have the same architecture in production in a number of places [...]

MySQL and Hadoop Integration - Unlocking New Insight

“Big Data” offers the potential for organizations to revolutionize their operations. With the volume of business data doubling every 1.2 years, analysts and business users are discovering very real benefits when integrating and analyzing data from multiple sources, enabling deeper insight into their customers, partners, and business processes.

As the world’s most popular open source database, and the most deployed database in the web and cloud, MySQL is a key component of many big data platforms, with Hadoop vendors estimating 80% of deployments are integrated with MySQL.

The new Guide to MySQL and Hadoop presents the tools enabling integration between the two data platforms, supporting the data lifecycle from acquisition and organisation to …

Two Cons against NoSQL. Part I.

Two cons against NoSQL data stores read like this: 1. It’s very hard to move data out of one NoSQL store into some other system, even another NoSQL store. There is very hard lock-in when it comes to NoSQL. If you ever have to move to another database, you basically have to re-implement a lot [...]

Log Buffer #279, A Carnival of the Vanities for DBAs

In a typical organization, all work together to bring out a common good for the outside world. It’s interesting to see how all of these entities blog about technology, and there is more and more interest shown by managerial technologists about the database. This Log Buffer Edition appeases their appetites along with the others in [...]

MySQL and Hadoop

Introduction

"Improving MySQL performance using Hadoop" was the talk that Manish Kumar and I gave at JavaOne & Oracle Develop 2012, India. Based on the response and interest of the audience, we decided to summarize the talk in a blog post. The slides of this talk can be found here. They also include a screencast of a live Hadoop system pulling data from MySQL and working on the popular 'word count' problem.



MySQL and Hadoop have been popularly considered as 'Friends with benefits' and our talk was aimed at showing how!

The benefits of MySQL to developers are the speed, reliability, data integrity and …

So now Hadoop's days are numbered?

Earlier this week we all read GigaOM's article with this title:
"Why the days are numbered for Hadoop as we know it"

I know GigaOM likes to provoke scandals sometimes, we all remember some other unforgettable piece, but there is something behind it...

Hadoop today (after SOA not so long ago) is one of the worst cases of an abused buzzword ever known to man. It's everything, everywhere, can cure illnesses and do "big-data" at the same time! Wow! Actually Hadoop is a software framework that supports data-intensive distributed applications, derived from Google's MapReduce and Google File System (GFS) papers.
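To make the MapReduce idea a bit more concrete, here is a minimal single-process sketch in Python of the classic word count. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration, and a real cluster would spread each phase across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, the step a framework performs between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all counts emitted for one word into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence, all in one process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big cluster", "data pipeline"]
    print(word_count(sample))
```

The point of the split is that map_phase and reduce_phase each see only one line or one key at a time, which is what lets a framework run them in parallel on separate nodes.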

My take from the article is this: Hadoop is a foundation, low-level platform. I used the word …

A super-set of MySQL for Big Data. Interview with John Busch, Schooner.

“Legacy MySQL does not scale well on a single node, which forces granular sharding and explicit application code changes to make them sharding-aware and results in low utilization of servers” – Dr. John Busch, Schooner Information Technology

A super-set of MySQL suitable for Big Data? On this subject, I have interviewed Dr. John Busch, Founder, Chairman, [...]
