Planet MySQL

Displaying posts with tag: Hive

Aug

2018

Databook: Turning Big Data into Knowledge with Metadata at Uber

Posted by Uber Engineering on Fri 03 Aug 2018 15:30 UTC
Tags:

postgres, Infrastructure, metadata, Architecture, data warehouse, Data Management, vertica, cassandra, quartz, Hive, MySQL, hdfs, Kafka, gradle, Uber, Uber Data, Data Storage, Databook, Dropwizard, Queryparser, RESTful API, Uber Data Knowledge, Uber Engineering

From driver and rider locations and destinations, to restaurant orders and payment transactions, every interaction on Uber’s transportation platform is driven by data. Data powers Uber’s global marketplace, enabling more reliable and seamless user experiences across our products for riders, …

The post Databook: Turning Big Data into Knowledge with Metadata at Uber appeared first on Uber Engineering Blog.

May

2017

Will SQL just die already?

Posted by Sean Hull on Fri 05 May 2017 18:48 UTC
Tags:

postgres, Oracle, sql, data, hadoop, Database Management, All, Hive, MySQL, Database Operations, redshift, bigquery

With tons of new NoSQL database offerings every day, developers & architects have a lot of options: Cassandra, MongoDB, CouchDB, DynamoDB & Firebase, to name a few. Join 33,000 others and follow Sean Hull on Twitter @hullsean. What’s more, in the data warehouse space you have Hadoop, which can churn through terabytes of data and get … Continue reading Will SQL just die already? →

Oct

2016

Designing Euclid to Make Uber Engineering Marketing Savvy

Posted by Uber Engineering on Wed 19 Oct 2016 16:05 UTC
Tags:

marketing, api, hadoop, Hive, MySQL, spark, General Engineering

Fast, granular, reliable ROI on ad performance was our bugle call to build Euclid, Uber’s in-house marketing platform. Early this year, Euclid replaced a legacy system, which processed ROI data somewhat manually as it struggled to keep up with Uber’s …

The post Designing Euclid to Make Uber Engineering Marketing Savvy appeared first on Uber Engineering Blog.

Jul

2016

The Uber Engineering Tech Stack, Part II: The Edge and Beyond

Posted by Uber Engineering on Thu 21 Jul 2016 16:09 UTC
Tags:

Open Source, javascript, database, Python, mobile, data, hadoop, MapReduce, git, big data, soa, cassandra, go, riak, Hive, node.js, MySQL, node, elasticsearch, Kafka, flask, General Engineering, d3.js, IPython, Jupyter, Mapbox, Marketplace, NPM, React, Ringpop, Uber Data, UberEATS, UberRUSH

Uber’s mission is transportation as reliable as running water, everywhere, for everyone. Last time, we talked about the foundation that powers Uber Engineering. Now, we’ll explore the parts of the stack that face riders and drivers, starting …

The post The Uber Engineering Tech Stack, Part II: The Edge and Beyond appeared first on Uber Engineering Blog.

Jan

2016

How to Deploy a Cluster

Posted by Valerie Parham-Thompson of The Pythian Group on Tue 05 Jan 2016 18:15 UTC
Tags:

cluster, hadoop, cloudera, big data, Hive, Technical Track, co-op

In this blog post I will talk about how to deploy a cluster, the methods I tried, and my solution to the prerequisites problem.

I’m fairly new to the big data field. While learning about Hadoop, I kept hearing about “clusters”, deploying a cluster, and installing some services on the namenode, some on the datanode, and so on. I also heard about Cloudera Manager, which helps deploy services on a cluster, so I set up a VM and followed several tutorials, including the Cloudera documentation, to install Cloudera Manager. However, every time I reached the “cluster installation” step, my installation failed. I later found out that a Cloudera Manager installation has several prerequisites, and unmet prerequisites were the reason the install failed.

Deploy a Cluster

Though I discuss 3 other methods in detail, ultimately I recommend method …

[Read more]

Jun

2015

Log Buffer #428: A Carnival of the Vanities for DBAs

Posted by The Pythian Group on Mon 22 Jun 2015 17:45 UTC
Tags:

Oracle, dns, Log Buffer, Pythian, SQL Server, syntax, mariadb, flushing, pivot, Hive, GoldenGate, MySQL, SSIS, SSRS

The Log Buffer Edition once again is sparkling with some gems, hand-picked from Oracle, SQL Server and MySQL.

Oracle:

Oracle GoldenGate 12.1.2.1.1 is now certified with Unity 14.10. With this certification, customers can use Oracle GoldenGate to deliver data to Teradata Unity which can then automate the distribution of data to multiple Teradata databases.
How do I change DNS servers on Exadata storage servers?
Flushing Shared Pool Does Not Slow Its Growth.
…

[Read more]

Apr

2014

Using Apache Hadoop and Impala together with MySQL for data analysis

Posted by Alexander Rubin of MySQL Performance Blog on Mon 21 Apr 2014 13:43 UTC
Tags:

scalability, hadoop, Hive, MySQL, Performance, Impala, Data Science

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from MySQL to Hadoop, load the data into Cloudera Impala (columnar format) and run reports on top of that. For the examples below I will use the “ontime flight performance” data from my previous post (Increasing MySQL performance with parallel query execution). I used Cloudera Manager v.4 to install Apache Hadoop and Impala. For this test …

[Read more]

Sep

2013

Percona Live London 2013: an insider’s view of the schedule

Posted by MySQL Performance Blog on Wed 18 Sep 2013 05:00 UTC
Tags:

Benchmarks, hadoop, Hive, percona live, Events and Announcements, Hardware and Storage, Insight for DBAs, Insight for Developers, MySQL, PerconaLive, Percona XtraBackup, PLUK, Percona Live London 2013, PLUK13

With the close of call for papers earlier this month, the Percona Live London conference committee was in full swing this past week reviewing all of the many submissions for November’s Percona Live London MySQL Conference.

The submissions are far-ranging and cover some really interesting topics, making the lineup for Percona Live London really strong! What the committee looks for in a submission is how much “value” a talk will bring to the conference – this is to say it needs to be far more than a product demo. As such, real-world experiences are receiving much more favorable reviews, along with talks that cover methodologies the attendees will …

[Read more]

Aug

2013

Big Data with MySQL and Hadoop at MySQL Connect 2013

Posted by Alexander Rubin of MySQL Performance Blog on Thu 08 Aug 2013 10:00 UTC
Tags:

hadoop, big data, sqoop, Hive, MySQL, flume, MySQL Connect 2013, Alexander Rubin

I will be talking about Big Data with MySQL and Hadoop at MySQL Connect 2013 (Sept. 21-22) in San Francisco as well as at Percona University in Washington, DC (September 12, 2013). Apache Hadoop is a very popular Big Data solution and we can nowadays easily integrate it with MySQL. I will start with a brief introduction of Apache Hadoop and its components (HDFS, Map/Reduce, Hive, HBase/HCatalog, Flume, Sqoop, etc). Next I will show 2 major Big Data scenarios:

From file to Hadoop to MySQL. This is an example of an “ELT” process: Extract data from external source; Load data into Hadoop; Transform data/Analyze data; Extract results to MySQL. It is similar to the original Data Warehouse ETL …

[Read more]

Jul

2013

MySQL and Hadoop integration

Posted by Alexander Rubin of MySQL Performance Blog on Thu 11 Jul 2013 10:00 UTC
Tags:

hadoop, sqoop, Hive, Insight for DBAs, MySQL, Apache Hadoop, Data Science, no sql

Dolphin and Elephant: an Introduction

This post is intended for MySQL DBAs or sysadmins who need to start using Apache Hadoop and want to integrate those 2 solutions. In this post I will cover some basic information about Hadoop, focusing on Hive as well as MySQL and Hadoop/Hive integration.

First of all, if you have been dealing with MySQL or any other relational database most of your professional life (like I have), Hadoop may look different. Very different. Apparently, Hadoop is the opposite of any relational database. Unlike the database where we have a set of tables and indexes, Hadoop works with a set of text files. And… there are no indexes at all. And yes, this may be shocking, but all scans are sequential (full “table” scans in MySQL terms).
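To make the contrast concrete, here is a minimal sketch in plain Python. It is not Hadoop or MySQL code, and the sample rows, column layout, and function names are all hypothetical, made up for illustration. It shows the difference between reading every line of a text file to find matching rows (a sequential scan) and looking the answer up in a prebuilt index, which is what a relational database maintains for you.

```python
# Hypothetical sample data: each string is one "row" in a plain text file,
# in the form id,carrier,date,delay. There is no index, only lines to read.
rows = [
    "1,AA,2013-07-01,12",
    "2,DL,2013-07-01,0",
    "3,AA,2013-07-02,45",
    "4,UA,2013-07-02,7",
]


def full_scan(lines, carrier):
    """Sequential scan: parse every line and keep the ids that match."""
    matches = []
    for line in lines:
        row_id, row_carrier, _date, _delay = line.split(",")
        if row_carrier == carrier:
            matches.append(int(row_id))
    return matches


def build_index(lines):
    """Build a carrier -> row ids lookup, roughly what a database index gives you."""
    index = {}
    for line in lines:
        row_id, row_carrier, _date, _delay = line.split(",")
        index.setdefault(row_carrier, []).append(int(row_id))
    return index


# The scan touches every row on every query.
scan_result = full_scan(rows, "AA")

# The index is built once; each later query is a single dictionary lookup.
index = build_index(rows)
index_result = index.get("AA", [])
```

Both approaches return the same row ids. The difference is cost: the scan reads all rows for every query, while the index pays that cost once up front and must be kept up to date as data changes.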

So, when does Hadoop make sense?

First, Hadoop is great if you need to …

[Read more]
