Planet MySQL

Displaying posts with tag: hadoop

Oct

2018

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

Posted by Uber Engineering on Wed 17 Oct 2018 16:00 UTC
Tags:

Apache, engineering, storage, Architecture, hadoop, data warehouse, big data, json, MySQL, Data Modeling, latency, Apache Hadoop, Docker, Apache Spark, Uber Data, PostgresSQL, hoodie, Apache Parquet, Hudi, Uber Eng

Uber is committed to delivering safer and more reliable transportation across our global markets. To accomplish this, Uber relies heavily on making data-driven decisions at every level, from forecasting rider demand during high traffic events to identifying and addressing bottlenecks…

The post Uber’s Big Data Platform: 100+ Petabytes with Minute Latency appeared first on Uber Engineering Blog.

May

2018

Don’t Drown in your Data Lake

Posted by MySQL Performance Blog on Thu 31 May 2018 17:37 UTC
Tags:

hadoop, big data, MySQL, open source databases, data lake

A data lake is “…a method of storing data within a system or repository, in its natural format, that facilitates the collocation of data in various schemata and structural forms…” [1]. Many companies find value in using a data lake but don’t realize that they need to plan for it properly and maintain it in order to prevent issues.

The idea of a data lake arose from the need to store data in a raw format that is accessible to a variety of applications and authorized users. Hadoop is often used to query the data, and the necessary structures for querying are created through the query tool (schema on read) rather than as part of the data design (schema on write). There are other tools available for analysis, and many cloud providers are actively developing additional options for creating …

[Read more]
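As a rough illustration of the schema-on-read idea mentioned in the excerpt above, the sketch below keeps raw records untouched and applies a schema only when they are queried. It is plain Python rather than Hadoop, and the records, field names, and the `read_with_schema` helper are all hypothetical, chosen only to show the principle.

```python
import json

# Hypothetical raw records, stored exactly as they arrived.
# Nothing validates or reshapes them at write time.
RAW_LINES = [
    '{"user": "ana", "amount": "12.50", "ts": "2018-05-31"}',
    '{"user": "bo", "amount": 7}',
    '{"user": "cy", "amount": "n/a", "note": "refund pending"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at query time: select fields and cast them.

    Fields that are missing or cannot be cast come back as None,
    so messy raw data does not abort the whole read.
    """
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            try:
                row[field] = cast(record[field])
            except (KeyError, ValueError, TypeError):
                row[field] = None
        yield row


# Two different "queries" over the same raw data, each with its own schema.
spend_schema = {"user": str, "amount": float}
audit_schema = {"user": str, "note": str}

spend_rows = list(read_with_schema(RAW_LINES, spend_schema))
audit_rows = list(read_with_schema(RAW_LINES, audit_schema))

# Sum only the amounts that could be interpreted as numbers.
total_spend = sum(r["amount"] for r in spend_rows if r["amount"] is not None)
```

With schema on write, the malformed `"n/a"` amount would have been rejected or fixed at load time; here it is tolerated in storage and each reader decides how to interpret it.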

May

2018

MariaDB to Hadoop in Spanish

Posted by MC Brown on Wed 16 May 2018 08:12 UTC
Tags:

Replication, Databases, Random, Commentary, hadoop, MySQL, tungsten-replicator

Nicolas Tobias has written an awesome guide to setting up replication from MariaDB to Hadoop/HDFS using Tungsten Replicator, in Spanish! He’s planning more of these so if you like what you see, please let him know!

Holy Week, and here I am with new battles to recount.
I was at work, wondering how to spend the calm that comes with the holiday days we can freely choose to work, and I thought: wouldn't it be good to finish that synchronization between the MariaDB servers and Hive?

I had already looked for some information on this in January. I even had a PoC set up on some VMs, which I powered back on, but everything had rotted: it wouldn't start, nothing worked, I couldn't even remember how I had done it, and the shell history was gibberish. I decided that if I redid everything from scratch I could write it down in a playbook and, besides, learn it and automate it to the point of being able to deploy automatically on …

[Read more]

Jun

2017

On Apache Ignite, Apache Spark and MySQL. Interview with Nikita Ivanov

Posted by Roberto V. Zicari on Fri 30 Jun 2017 13:40 UTC
Tags:

Uncategorized, sql, memcached, data warehousing, analytics, hadoop, mysq, Gridgain, SaaS, big data, vertica, redis, internet of things, machine learning, Tableau, Apache Ignite, Nikita Ivanov, proxysql, Apache Spark, vitess, ClickHouse, Apache Ignite In-Memory SQL Grid, Apache Kafka, ETL processes, in-memory computing, in-memory data grids, Spark Streaming

“Spark and Ignite can complement each other very well. Ignite can provide shared storage for Spark so state can be passed from one Spark application or job to another. Ignite can also be used to provide distributed SQL with indexing that accelerates Spark SQL by up to 1,000x.”–Nikita Ivanov.

I have interviewed Nikita Ivanov, CTO of GridGain.
The main topics of the interview are Apache Ignite, Apache Spark and MySQL, and how well they perform on big data analytics.

RVZ

Q1. What are the main technical challenges of SaaS development projects?

Nikita Ivanov: SaaS requires that the applications be highly responsive, reliable and web-scale. SaaS development projects face many of the same challenges as …

[Read more]

May

2017

A roughneck walk down database alley

Posted by Sean Hull on Thu 11 May 2017 21:39 UTC
Tags:

sql, data, memcache, hadoop, Database Management, NoSQL, redis, devops, relational, MySQL, redshift, biodata, emr

I was just responding to some Disqus comments on a recent blog post. Admittedly, it had a provocative title: “Will SQL databases just die already”. What do you think? Join 34,000 others and follow Sean Hull on twitter @hullsean. A reader pointed out that some NoSQL databases do support joins. Huh? My face … Continue reading A roughneck walk down database alley →

May

2017

Will SQL just die already?

Posted by Sean Hull on Fri 05 May 2017 18:48 UTC
Tags:

postgres, Oracle, sql, data, hadoop, Database Management, All, Hive, MySQL, Database Operations, redshift, bigquery

With tons of new NoSQL database offerings every day, developers & architects have a lot of options: Cassandra, MongoDB, CouchDB, DynamoDB & Firebase, to name a few. Join 33,000 others and follow Sean Hull on twitter @hullsean. What’s more, in the data warehouse space, you have Hadoop, which can churn through terabytes of data and get … Continue reading Will SQL just die already? →

Oct

2016

HopsFS based on MySQL Cluster 7.5 delivers a scalable HDFS

Posted by Mikael Ronström on Wed 19 Oct 2016 23:15 UTC
Tags:

MySQL Cluster, ndb, hadoop, hdfs, MySQL Cluster 7.5, HopsFS

The Swedish research institute SICS has worked hard for a few years on
developing a scalable and highly available Hadoop implementation using
MySQL Cluster to store the metadata. In particular, they have focused on the
Hadoop file system (HDFS) and YARN. Using features of MySQL
Cluster 7.5, they were able to achieve linear scaling in the number of name
nodes as well as in the number of NDB data nodes, up to the number of nodes
available for the experiment (72 machines). Read the press release from
SICS here

The existing metadata layer of HDFS is based on a single Java server
that acts as the name node in HDFS. There are implementations to ensure
that this metadata layer has HA by using a backup name node and to
use ZooKeeper for heartbeats and a number of …

[Read more]

Oct

2016

Designing Euclid to Make Uber Engineering Marketing Savvy

Posted by Uber Engineering on Wed 19 Oct 2016 16:05 UTC
Tags:

marketing, api, hadoop, Hive, MySQL, spark, General Engineering

Fast, granular, reliable ROI on ad performance was our bugle call to build Euclid, Uber’s in-house marketing platform. Early this year, Euclid replaced a legacy system, which processed ROI data somewhat manually as it struggled to keep up with Uber’s …

The post Designing Euclid to Make Uber Engineering Marketing Savvy appeared first on Uber Engineering Blog.

Jul

2016

The Uber Engineering Tech Stack, Part II: The Edge and Beyond

Posted by Uber Engineering on Thu 21 Jul 2016 16:09 UTC
Tags:

Open Source, javascript, database, Python, mobile, data, hadoop, MapReduce, git, big data, soa, cassandra, go, riak, Hive, node.js, MySQL, node, elasticsearch, Kafka, flask, General Engineering, d3.js, IPython, Jupyter, Mapbox, Marketplace, NPM, React, Ringpop, Uber Data, UberEATS, UberRUSH

Uber Engineering

Uber’s mission is transportation as reliable as running water, everywhere, for everyone. Last time, we talked about the foundation that powers Uber Engineering. Now, we’ll explore the parts of the stack that face riders and drivers, starting …

The post The Uber Engineering Tech Stack, Part II: The Edge and Beyond appeared first on Uber Engineering Blog.

Jul

2016

Eight Ways To Ensure Your Applications Are Enterprise-Ready

Posted by The Pythian Group on Tue 12 Jul 2016 15:00 UTC
Tags:

Oracle, Open Source, hadoop, NoSQL, MySQL, Technical Track, Business Insights

When it comes to building database applications and solutions, developers, DBAs, engineers and architects have a lot of new and exciting tools and technologies to play with, especially with the Hadoop and NoSQL environments growing so rapidly.

While it’s easy to geek out about these cool and revolutionary new technologies, at some point in the development cycle you’ll need to stop to consider the real-world business implications of the application you’re proposing. After all, you’re bound to face some tough questions, like:

Why did you choose that particular database for our mission-critical application? Can your team provide 24/7 support for the app? Do you have a plan to train people on this new technology? Do we have the right hardware infrastructure to support the app’s deployment? How are you going to ensure there won’t be any bugs or security vulnerabilities?

If you don’t have a plan for …

[Read more]
