Showing entries 1 to 10 of 17
7 Older Entries »
Displaying posts with tag: ClickHouse (reset)
My 2019 Database Wishlist

Last year I published my 2018 Database Wishlist, which I recently revisited to check what happened and what didn’t. Time for a 2019 wishlist.

I am not going to list items from my 2018 list, even if they didn’t happen or they partially happened. Not because I changed my mind about their importance. Just because I wrote about them recently, and I don’t want to be more boring than I usually am.

External languages for MySQL and MariaDB

MariaDB 10.3 implemented a parser for PL/SQL stored procedures. This could be good for their business, as it facilitates the migration from Oracle. But …

[Read more]
Replicating data into Clickhouse

Clickhouse is a relatively new analytics and datawarehouse engine that provides for very quick insertion and analysing of data. Like most analytics platforms it’s built on a column-oriented storage basis and unlike many alternatives is completely open source. It’s also exceedingly fast, even on relatively modest platforms.

Clickhouse does have some differences from some other environments, for example, data inserted cannot easily be updated, and it supports a number of different storage and table engine formats that are used to store and index the information. So how do we get into that from our MySQL transactional store?

Well, you can do dumps and loads, or you could use Tungsten Replicator to do that for you. The techniques I’m going to describe here are not in an active release, but use the same principles as other part of our data loading.

We’re going to use the CSV-based batch loading system that is …

[Read more]
Percona Live Europe Presents: ClickHouse at Messagebird: Analysing Billions of Events in Real-Time*

We’ll look into how Clickhouse allows us to ingest a large amount of data and run complex analytical interactive queries at MessageBird,. We also present the business needs that brought ClickHouse to our attention and detail the journey to its deployment. We cover the problems we faced, and how we dealt with them. We talk about our current Cloud production setup and how we deployed and use it.

We are really enthusiastic to share a use case of Clickhouse, how it helped us to scale our analytics stack with the good, the bad and the ugly.

The talk could be useful to newcomers and everyone wondering if Clickhouse could be useful to them.

What we’re looking forward to…

There are many talks, but these are among the top ones we’re looking forward to in particular:

[Read more]
ClickHouse: Two Years!

Following my post from a year ago https://www.percona.com/blog/2017/07/06/clickhouse-one-year/, I wanted to review what happened in ClickHouse during this year.
There is indeed some interesting news to share.

1. ClickHouse in DB-Engines Ranking. It did not quite get into the top 100, but the gain from position 174 to 106 is still impressive. Its DB-Engines Ranking score tripled from 0.54 last September to 1.57 this September

And indeed, in my conversation with customers and partners, the narrative has changed from: “ClickHouse, what is …

[Read more]
Easy and Effective Way of Building External Dictionaries for ClickHouse with Pentaho Data Integration Tool

In this post, I provide an illustration of how to use Pentaho Data Integration (PDI) tool to set up external dictionaries in MySQL to support ClickHouse. Although I use MySQL in this example, you can use any PDI supported source.

ClickHouse

ClickHouse is an open-source column-oriented DBMS (columnar database management system) for online analytical processing. Source: wiki.

Pentaho Data Integration

Information from the Pentaho wiki: Pentaho Data Integration (PDI, also called Kettle) is the component of Pentaho responsible for the Extract, Transform and Load (ETL) processes. Though ETL tools are most frequently used in data warehouses environments, PDI can also be used for other purposes:

  • Migrating data between …
[Read more]
Analyze Your Raw MySQL Query Logs with ClickHouse

In this blog post, I’ll look at how you can analyze raw MySQL query logs with ClickHouse.

For typical query performance analyses, we have an excellent tool in Percona Monitoring and Management. You may want to go deeper, though. You might be longing for the ability to query raw MySQL “slow” query logs with SQL.

There are a number of tools to load the MySQL slow query logs to a variety of data stores. For example, you can find posts showing how to do it with LogStash. While very flexible, these solutions always look too complicated and limited in functionality to me.   

By far the best solution to parse and load MySQL slow query logs (among multiple log types supported) is Charity …

[Read more]
Archiving MySQL Tables in ClickHouse

In this blog post, I will talk about archiving MySQL tables in ClickHouse for storage and analytics.

Why Archive?

Hard drives are cheap nowadays, but storing lots of data in MySQL is not practical and can cause all sorts of performance bottlenecks. To name just a few issues:

  1. The larger the table and index, the slower the performance of all operations (both writes and reads)
  2. Backup and restore for terabytes of data is more challenging, and if we need to have redundancy (replication slave, clustering, etc.) we will have to store all the data N times

The answer is archiving old data. Archiving does not necessarily mean that the data will be permanently removed. Instead, the archived data can be placed into long-term storage (i.e., AWS S3) or loaded into a …

[Read more]
Updating/Deleting Rows From Clickhouse (Part 2)

In this post, we’ll look at updating and deleting rows with ClickHouse. It’s the second of two parts.

In the first part of this post, we described the high-level overview of implementing incremental refresh on a ClickHouse table as an alternative support for UPDATE/DELETE. In this part, we will show you the actual steps and sample code.

Prepare Changelog Table

First, we create the changelog table below. This can be stored on any other MySQL instance separate from the source of our analytics table. When we run the change capture script, it will record the data on this table that we can consume later with the incremental refresh script:

CREATE TABLE `clickhouse_changelog` (
  `db` varchar(255) NOT NULL …
[Read more]
Updating/Deleting Rows with ClickHouse (Part 1)

In this post, we’ll look at updating and deleting rows with ClickHouse. It’s the first of two parts.

Update: Part 2 of this post is here.

ClickHouse is fast – blazing fast! It’s quite easy to pick up, and with ProxySQL integrating with existing applications already using MySQL, it’s way less complex than using other analytics options. However, ClickHouse does not support UPDATE/DELETE (yet). That entry barrier can easily dissuade potential users despite the good things I mentioned.

If there is a will, there is a way! We have so far taken advantage of the new feature that supports more granular partitioning strategy (by week, by day or something else). With more …

[Read more]
ClickHouse MySQL Silicon Valley Meetup Wednesday, October 25 at Uber Engineering with Percona’s CTO Vadim Tkachenko

I will be presenting at the ClickHouse MySQL Silicon Valley Meetup on Wednesday, October 25, 2017, at 6:30 PM.

ClickHouse is a real-time analytical database system. Even though they’re only celebrating one year as open source software, it has already proved itself ready for the serious workloads. We will talk about ClickHouse in general, some internals and why it is so fast. ClickHouse works in conjunction with MySQL – traditionally weak for analytical workloads – and this presentation demonstrates how to make the two systems work together.

My talk will cover how we can improve the experience with real-time analytics using ClickHouse, and how we can …

[Read more]
Showing entries 1 to 10 of 17
7 Older Entries »