Planet MySQL

Displaying posts with tag: big data

Apr

2017

Percona Live 2017: Day Two Keynotes

Posted by MySQL Performance Blog on Wed 26 Apr 2017 19:46 UTC
Tags:

Oracle, innodb, big data, percona live, Insight for DBAs, Insight for Developers, MySQL, VividCortex, MySQL 8.0, Percona Live Keynotes

Welcome to the second day of the Percona Live Open Source Database Conference 2017, and the second set of Percona Live keynotes! It’s a bit rainy outside today, but that isn’t bothering the Percona Live attendees (we’re all indoors learning about new open source technologies)!

Day two of the conference kicked off with another four keynote talks, all of which discussed issues and technologies that are addressed by open source solutions:

The Open Source Database Business Model is Under Siege

Paul Dix (InfluxData)

Paul Dix’s keynote may have ruffled a few feathers, as he looked at possible futures for the open …

[Read more]

Apr

2017

Percona Live 2017 Tutorials Day

Posted by MySQL Performance Blog on Tue 25 Apr 2017 05:38 UTC
Tags:

innodb, Benchmarks, Tutorials, big data, mariadb, mongodb, json, percona live, Cloud and NoSQL, MySQL, Cloud and MySQL, MyRocks

Welcome to the first day of the Percona Live Open Source Database Conference: Percona Live 2017 tutorials day! While technically the first day of the conference, this day focused on providing hands-on tutorials for people interested in learning directly how to use open source tools and technologies.

Today attendees went to training sessions taught by open source database experts and got first-hand experience configuring, working with, and experimenting with various open source technologies and software.

The first full day (which includes opening keynote speakers and breakout sessions) starts Tuesday 4/25 at 9:00 am.

Some of the …

[Read more]

Mar

2017

Column Store Database Benchmarks: MariaDB ColumnStore vs. Clickhouse vs. Apache Spark

Posted by Alexander Rubin of MySQL Performance Blog on Fri 17 Mar 2017 18:12 UTC
Tags:

benchmark, column store, big data, MySQL, Apache Spark, Column Store Database, ClickHouse, MariaDB ColumnStore

This blog post shares some column store database benchmark results, and compares the query performance of MariaDB ColumnStore v. 1.0.7 (based on InfiniDB), ClickHouse and Apache Spark.

I’ve already written about ClickHouse (Column Store database).

The purpose of the benchmark is to see how these three solutions work on a single big server, with many CPU cores and large amounts of RAM. All three are massively parallel (MPP) systems, so they should use many cores for SELECT queries.

For the benchmarks, I chose …

[Read more]

Aug

2016

Database Challenges and Innovations. Interview with Jim Starkey

Posted by Roberto V. Zicari on Wed 31 Aug 2016 03:33 UTC
Tags:

Open Source, Uncategorized, sql, Google, RDBMS, Jim Starkey, big data, NoSQL, MySQL, nosql databases, NuoDB, New and old Data stores, relational model, codd, AmorphousDB, distributed database systems

“Isn’t it ironic that in 2016 a non-skilled user can find a web page from Google’s untold petabytes of data in millisecond time, but a highly trained SQL expert can’t do the same thing in a relational database one billionth the size?” –Jim Starkey

I have interviewed Jim Starkey, a database legend whose career as an entrepreneur, architect, and innovator spans more than three decades of database history.

RVZ

Q1. In your opinion, what are the most significant advances in databases in the last few years?

Jim Starkey: I’d have to say the “atom programming model” where a database is layered on a substrate of peer-to-peer replicating distributed objects rather than disk files. The atom programming model enables scalability, redundancy, high availability, and distribution not available in traditional, disk-based database …

[Read more]

Aug

2016

LinkedIn China new Social Platform Chitu. Interview with Dong Bin.

Posted by Roberto V. Zicari on Thu 04 Aug 2016 19:27 UTC
Tags:

Oracle, Uncategorized, analytics, big data, neo4j, MySQL, relational databases, OrientDB, Chitu, Dong Bin, Graph Databases, Liepin, LinkedIn China, Maimai

“Complicated queries, like looking for second degree friends, is really hard to traditional databases.” –Dong Bin

I have interviewed Dong Bin, Engineer Manager at LinkedIn China. The LinkedIn China development team launched a new social platform — known as Chitu — to attract a meaningful segment of the Chinese professional networking market.

RVZ

Q1. What is your role at LinkedIn China?

Dong Bin: I am an Engineer Manager in charge of the backend services for Chitu. The backend includes all Chitu’s consumer-based features, like feeds, chat, event, etc.

Q2. You recently launched a new social platform, called Chitu. Which segment of the …

[Read more]

Jul

2016

The Uber Engineering Tech Stack, Part II: The Edge and Beyond

Posted by Uber Engineering on Thu 21 Jul 2016 16:09 UTC
Tags:

Open Source, javascript, database, Python, mobile, data, hadoop, MapReduce, git, big data, soa, cassandra, go, riak, Hive, node.js, MySQL, node, elasticsearch, Kafka, flask, General Engineering, d3.js, IPython, Jupyter, Mapbox, Marketplace, NPM, React, Ringpop, Uber Data, UberEATS, UberRUSH

Uber Engineering

Uber’s mission is transportation as reliable as running water, everywhere, for everyone. Last time, we talked about the foundation that powers Uber Engineering. Now, we’ll explore the parts of the stack that face riders and drivers, starting …

The post The Uber Engineering Tech Stack, Part II: The Edge and Beyond appeared first on Uber Engineering Blog.

Feb

2016

A Grand Tour of Big Data. Interview with Alan Morrison

Posted by Roberto V. Zicari on Thu 25 Feb 2016 15:52 UTC
Tags:

Uncategorized, amazon, cloud, analytics, hadoop, big data, NoSQL, newsql, cloud stores, nosql databases, Alan Morrison, Center for Technology and Innovation, PwC

“Leading enterprises have a firm grasp of the technology edge that’s relevant to them. Better data analysis and disambiguation through semantics is central to how they gain competitive advantage today.”–Alan Morrison.

I have interviewed Alan Morrison, senior research fellow at PwC, Center for Technology and Innovation. The main topic of the interview is how the Big Data market is evolving.

RVZ

Q1. How do you see the Big Data market evolving?

Alan Morrison: We should note first of all how true Big Data and analytics methods emerged and what has been disruptive. Over the course of a decade, web companies have donated IP and millions of lines of code that serve as the foundation for what’s being built on top. In the …

[Read more]

Jan

2016

How to Deploy a Cluster

Posted by Valerie Parham-Thompson of The Pythian Group on Tue 05 Jan 2016 18:15 UTC
Tags:

cluster, hadoop, cloudera, big data, Hive, Technical Track, co-op

In this blog post I will talk about how to deploy a cluster, the methods I tried, and my solution to the prerequisites problem.

I’m fairly new to the big data field. While learning about Hadoop, I kept hearing about “clusters”: deploying a cluster, installing some services on the namenode and others on the datanode, and so on. I also heard about Cloudera Manager, which helps deploy services on a cluster, so I set up a VM and followed several tutorials, including the Cloudera documentation, to install Cloudera Manager. However, every time I reached the “cluster installation” step, my installation failed. I later found out that a Cloudera Manager installation has several prerequisites, and not meeting them was the reason the installation failed.

Deploy a Cluster

Though I discuss 3 other methods in detail, ultimately I recommend method …

[Read more]

Dec

2015

New VMware Continuent 5.0 – A powerful and cost efficient Oracle GoldenGate alternative!

Posted by Petri Virsunen of Continuent on Mon 21 Dec 2015 21:57 UTC
Tags:

Oracle, cloud, VMWare, big data, MySQL, Apache Hadoop, Data Analytics, database replication, Amazon Redshift, HP Vertica

VMware Continuent 5.0 is a complete data replication solution that includes all the functionality you need at one low price. In this webinar-on-demand, you’ll see how VMware Continuent delivers: Migration. Replicate from an old version of Oracle, often running on a non-Linux platform (Windows, AIX, HP-UX, Solaris), to a new version of Oracle (often running on Linux). VMware Continuent supports

Nov

2015

Big Data: InfiniDB vs Spider: What else ?

Posted by Stephane Varoqui on Thu 12 Nov 2015 17:55 UTC
Tags:

big data, mariadb, spider, parallel query

Many of my recent engagements have centered on strategies for implementing real-time big data analytics: computing the hardware cost of extending a single table collection with MariaDB and the Parallel Query feature of the Spider storage engine, in order to offload columnar MPP storage such as InfiniDB or Vertica.

As of today, Parallel Query is only available in releases of MariaDB Spider supported by Spiral Arms. Parallel query with Spider is most efficient on GROUP BY and COUNT queries that use a single Spider table. In that case, the Spider engine performs query pushdown, also known as map-reduce.
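
To make the pushdown idea concrete, here is a minimal Python sketch of a GROUP BY / COUNT that is computed separately on each backend and then merged. It is only an illustration of the map-reduce pattern, not Spider's actual implementation: the in-memory `BACKENDS` lists, the sample rows, and both function names are hypothetical stand-ins for real backend servers and the SQL they would run.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each list stands in for the rows held by one backend server.
BACKENDS = [
    [("fr", 1), ("us", 2), ("fr", 3)],
    [("us", 4), ("de", 5)],
    [("fr", 6), ("de", 7), ("de", 8)],
]


def local_group_count(rows):
    """'Map' step: the GROUP BY / COUNT a backend would run on its own rows."""
    return Counter(country for country, _ in rows)


def pushed_down_group_count(backends):
    """Run the local aggregate on every backend concurrently, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        partials = list(pool.map(local_group_count, backends))
    total = Counter()
    for partial in partials:
        total.update(partial)  # 'Reduce' step: add up per-group counts
    return dict(total)


if __name__ == "__main__":
    print(pushed_down_group_count(BACKENDS))
```

The point of the pattern is that only small per-group partial counts travel back to the coordinating node, rather than every row from every backend.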

Spider provides multiple levels of parallel execution for a single partitioned table.

The first level is per backend server:
To tell Spider to scan different backends concurrently, set spider_sts_bg_mode=1

Other level is per …

[Read more]
