Showing entries 8181 to 8190 of 44865
MySQL Document Store: unstructured data, unstructured search

Storing documents is convenient: no need to define a schema up-front, no downtime for schema changes, no normalization, no slow joins – true or not, you name it. But how do you search if you do not know how your data is structured? For example, the famous SFW construct requires listing the columns to search: SELECT … FROM … WHERE some_column = 'Jippie'. Given that JSON data can have no schema, how do you write a query without knowing any field names? Where is SELECT … FROM … WHERE * = 'Jippie'? JSON_SEARCH() gets you started, but there are gaps if you think about it for a minute.
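
To make the idea of searching "all fields" concrete, here is a minimal Python sketch, not MySQL's implementation: it walks an arbitrary JSON document and reports a path for every string value equal to the search term. The function name find_paths, the sample document, and the simplified path syntax (keys are not quoted) are assumptions made for illustration. Like JSON_SEARCH(), it only looks at string values, which hints at one of the gaps: numbers and booleans are not matched.

```python
import json


def find_paths(node, target, path="$"):
    """Yield a JSON-path-like string for every string value equal to target."""
    if isinstance(node, dict):
        # Descend into each member, extending the path with the key name.
        for key, value in node.items():
            yield from find_paths(value, target, f"{path}.{key}")
    elif isinstance(node, list):
        # Descend into each array element, extending the path with the index.
        for index, value in enumerate(node):
            yield from find_paths(value, target, f"{path}[{index}]")
    elif isinstance(node, str) and node == target:
        # Only string scalars are compared; other scalar types are skipped.
        yield path


# Hypothetical document with no agreed-upon structure.
doc = json.loads(
    '{"name": "Jippie", "tags": ["a", "Jippie"],'
    ' "meta": {"owner": {"nick": "Jippie"}, "age": 3}}'
)
paths = list(find_paths(doc, "Jippie"))
print(paths)
```

The caller never names a field: the walk discovers the structure as it goes, which is the behaviour the hypothetical WHERE * = 'Jippie' asks for.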

There are many reasons why you may not know the structure of the data you operate on. Maybe you have gathered documents using different “schema versions” over time; maybe there is simply no common structure because the data comes from …

[Read more]
How Apache Spark makes your slow MySQL queries 10x faster (or more)

In this blog post, we’ll discuss how to improve the performance of slow MySQL queries using Apache Spark.

Introduction

In my previous blog post, I wrote about using Apache Spark with MySQL for data analysis and showed how to transform and analyze a large volume of data (text files) with Apache Spark. Vadim also performed a benchmark comparing performance of MySQL and Spark with Parquet columnar format (using Air traffic performance data). That works great, but what if we don’t want to move our data from MySQL to another storage (i.e., …

[Read more]
MariaDB Galera Cluster 5.5.51 and Connector/J 1.5.1 now available

The MariaDB project is pleased to announce the immediate availability of MariaDB Galera Cluster 5.5.51 Stable (GA), and MariaDB Connector/J 1.5.1 Release Candidate (RC). See the release notes and changelogs for details on these releases.

  • Download MariaDB Galera Cluster 5.5.51
  • Release Notes
  • Changelog
  • What is MariaDB Galera Cluster?
  • MariaDB APT and YUM Repository Configuration Generator

[…]

The post MariaDB Galera Cluster 5.5.51 and Connector/J 1.5.1 now available appeared first on MariaDB.org.

Trying out MySQL in Docker Swarm Mode

Orchestration tools are often used when scaling out an application stack. In a Docker environment, tools like Kubernetes, Mesos and Docker Swarm have typically been used for this purpose. Docker has brought significant updates to their orchestration offering with their latest release. In this blog post, we’ll give a contextual overview of the orchestration features offered in […]

What’s next

I received an overwhelming number of comments when I said I was leaving MariaDB Corporation. Thank you – it is really nice to be appreciated.

I haven’t left the MySQL ecosystem. In fact, I’ve joined Percona as their Chief Evangelist in the CTO Office, and I’m going to focus on the MySQL/Percona Server/MariaDB Server ecosystem, while also looking at MongoDB and other solutions that are good for Percona customers. Thanks again for the overwhelming response on the various social media channels, and via emails, calls, etc.

Here’s to a great time at Percona to focus on open source databases and solutions around them!

My first blog post on the Percona blog – I’m Colin Charles, and I’m here to evangelize …

[Read more]
Context aware MySQL pools via HAProxy

At GitHub we use MySQL as our main datastore. While repository data lies in git, metadata is stored in MySQL. This includes Issues, Pull Requests, Comments etc. We also auth against MySQL via a custom git proxy (babeld). To be able to serve under the high load GitHub operates at, we use MySQL replication to scale out read load.

We have different clusters that provide different types of services, but the single-writer-multiple-readers design applies to them all. Depending on traffic growth, application demand, operational tasks or other constraints, we take replicas in or out of our pools. Depending on workloads, some replicas may lag more than others.
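
As a rough illustration of taking replicas in or out of a read pool based on lag, here is a minimal Python sketch. It is not GitHub's tooling and not HAProxy configuration: the Replica class, the eligible_replicas function, the in_maintenance flag, and the one-second default threshold are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    host: str
    lag_seconds: float
    in_maintenance: bool = False  # hypothetical flag for operational tasks


def eligible_replicas(replicas, max_lag_seconds=1.0):
    """Return the replicas that should currently serve reads."""
    # Replicas pulled out for operational reasons never serve reads.
    available = [r for r in replicas if not r.in_maintenance]
    # Prefer replicas whose lag is within the (assumed) threshold.
    fresh = [r for r in available if r.lag_seconds <= max_lag_seconds]
    # If every replica is lagging, serve from all available ones
    # rather than leaving the pool empty.
    return fresh or available


pool = [
    Replica("replica-1", 0.2),
    Replica("replica-2", 5.0),
    Replica("replica-3", 0.0, in_maintenance=True),
]
serving = [r.host for r in eligible_replicas(pool)]
print(serving)
```

The fallback to lagging replicas is a design choice of this sketch: stale reads are treated as preferable to having no readers at all, which may or may not suit a given application.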

Displaying up-to-date data is important. We have tooling that helps us ensure we keep replication lag at a minimum, and typically it doesn’t exceed 1

[Read more]
Tracker: Ingesting MySQL data at scale - Part 2

In Part 1 we discussed our existing architecture for ingesting MySQL called Tracker, including its wins, challenges and an outline of the new architecture with a focus on the Hadoop side. Here we’ll focus on the implementation details on the MySQL side. The uploader of data to S3 has been open-sourced as part of the Pinterest MySQL Utils.

Tracker V-0

As a proof of concept, we wrote a hacky 96-line Bash script to unblock backups to Hive for a new data set. The script spawned a bunch of workers that each worked on one database at a time. For each table in the database, it ran SELECT INTO OUTFILE and then uploaded the data to S3. It worked, but Bash… and that just isn’t a long-term solution.
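
The worker layout described above can be sketched in a few lines of Python. This is not the original Bash script or the Pinterest MySQL Utils code: backup_database, backup_all, the stub functions, and the bucket name are hypothetical, and the stubs only build strings where a real version would run SELECT INTO OUTFILE and upload to S3.

```python
from concurrent.futures import ThreadPoolExecutor


def backup_database(database, tables, dump_table, upload):
    """Dump and upload every table of one database, one table at a time."""
    uploaded = []
    for table in tables:
        path = dump_table(database, table)  # stands in for SELECT INTO OUTFILE
        uploaded.append(upload(path))       # stands in for the S3 upload
    return uploaded


def backup_all(schema, dump_table, upload, workers=4):
    """Run one backup_database task per database on a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            database: pool.submit(backup_database, database, tables, dump_table, upload)
            for database, tables in schema.items()
        }
        # result() re-raises any exception from the worker, so failures surface.
        return {database: future.result() for database, future in futures.items()}


# Stubs for illustration only; no MySQL or S3 access happens here.
def fake_dump(database, table):
    return f"/tmp/{database}.{table}.tsv"


def fake_upload(path):
    return "s3://example-bucket" + path


result = backup_all({"db1": ["users", "pins"], "db2": ["boards"]}, fake_dump, fake_upload)
print(result)
```

Passing the dump and upload steps in as functions keeps the concurrency logic testable without a database, which is one of the things a shell script makes awkward.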

Tracker V-1

For our …

[Read more]
Webinar Thursday 8/18: Preventing and Resolving MySQL Downtime

Join Percona’s Jervin Real for a webinar on Thursday August 18, 2016 at 10 am PDT (UTC-7) on Preventing and Resolving MySQL Downtime.

Preventing MySQL downtime and emergencies is difficult. Often complex combinations of several things going wrong cause these emergencies. Without knowledge of the causes of emergencies, preventative proactive measures often fail to prevent further problems — no matter how sincere. This talk discusses some of the ways to prevent real production system emergencies, and suggests specific actions for:

  • Application stack configuration
  • MySQL server configuration
  • Operating system configuration
  • Troublesome server features …
[Read more]
Percona Toolkit 2.2.19 is now available

Percona is pleased to announce the availability of Percona Toolkit 2.2.19, released August 16, 2016. Percona Toolkit is a collection of advanced command-line tools that perform a variety of MySQL server and system tasks that DBAs find too difficult or complex to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release is the current GA (Generally Available) stable release in the 2.2 series. Downloads are available here and from the Percona Software …

[Read more]