Showing entries 1 to 10 of 21
Displaying posts with tag: ClickHouse
Database Replication from MySQL to ClickHouse for High Performance WebScale Analytics

MySQL to ClickHouse Replication 

MySQL works great for Online Transaction Processing (OLTP) systems, but its performance degrades with analytical queries on very large databases. You can optimize MySQL query performance with InnoDB compression, but why combine OLTP and OLAP (Online Analytical Processing) on one system when columnar stores deliver analytical queries far more efficiently? I have seen several companies build dedicated MySQL servers for analytics, only to end up spending more money over time on fine-tuning MySQL with no significant improvement. There is no point in blaming MySQL for something it was not built for: MySQL / MariaDB is a poor choice for columnar analytics and big data solutions. Columnar database systems are best suited for handling large quantities of data: data stored in columns typically is easier to compress, it is …

2019 Open Source Database Report: Top Databases, Public Cloud vs. On-Premise, Polyglot Persistence

Ready to transition from a commercial database to open source, and want to know which databases are most popular in 2019? Wondering whether an on-premise vs. public cloud vs. hybrid cloud infrastructure is best for your database strategy? Or, considering adding a new database to your application and want to see which combinations are most popular? We found all the answers you need at the Percona Live event last month, and broke down the insights into the following free trends reports:

MariaDB Track at Percona Live

Less than one month is left until Percona Live. This time the Committee work was a bit unusual. Instead of having one big committee for the whole conference, we had a few mini-committees, each responsible for a track. Each independent mini-committee, in turn, had a leader who was responsible for the whole process. I led the MariaDB track. In this post, I want to explain how we worked, which topics we chose, and why.

For MariaDB, we had seven slots (five for 50-minute talks and two for 25-minute talks) and 19 submissions, so we had to reject about two out of every three proposals. We also had to decide how many topics the program should cover. My aim here was to use the MariaDB track to demonstrate as many unique MariaDB features as …

ClickHouse Performance Uint32 vs Uint64 vs Float32 vs Float64

While implementing ClickHouse for query execution statistics storage in Percona Monitoring and Management (PMM), we faced the question of which data type to choose for the metrics we store. It came down to this: what is the difference in performance and space usage between the UInt32, UInt64, Float32, and Float64 column types?

To test this, I created a test table with an abbreviated and simplified version of the main table in our ClickHouse Schema.

The “number of queries” is stored four times in four different columns to be able to benchmark queries referencing different columns.  We can do this with ClickHouse because it is a column store and it works only with columns referenced by the query. This method would not be appropriate for testing on …
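As a rough illustration of the space side of that question, here is a minimal Python sketch (not the benchmark from the post) that uses the standard `struct` module to show the uncompressed width of each type and whether an integer counter survives being stored in it. The helper names `bytes_per_value` and `round_trips` are made up for this example, and real on-disk usage in ClickHouse also depends on compression, which this sketch ignores.

```python
import struct

# Fixed-width little-endian encodings matching the four column types
# discussed above. This mapping is an illustration, not ClickHouse code.
FORMATS = {
    "UInt32": "<I",
    "UInt64": "<Q",
    "Float32": "<f",
    "Float64": "<d",
}


def bytes_per_value(type_name: str) -> int:
    """Uncompressed size in bytes of one value of the given type."""
    return struct.calcsize(FORMATS[type_name])


def round_trips(type_name: str, value: int) -> bool:
    """True if an integer counter comes back unchanged after encoding."""
    fmt = FORMATS[type_name]
    try:
        packed = struct.pack(fmt, value)
    except (struct.error, OverflowError):
        return False  # the value is outside the range of the type
    return struct.unpack(fmt, packed)[0] == value


if __name__ == "__main__":
    counter = 2**24 + 1  # first integer a 32-bit float cannot hold exactly
    for name in FORMATS:
        print(name, bytes_per_value(name), round_trips(name, counter))
```

The 32-bit types take half the raw space of the 64-bit ones, but Float32 stops representing every integer exactly above 2^24, which matters if a column holds counts.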

My 2019 Database Wishlist

Last year I published my 2018 Database Wishlist, which I recently revisited to check what happened and what didn’t. Time for a 2019 wishlist.

I am not going to list items from my 2018 list, even if they didn’t happen or they partially happened. Not because I changed my mind about their importance. Just because I wrote about them recently, and I don’t want to be more boring than I usually am.

External languages for MySQL and MariaDB

MariaDB 10.3 implemented a parser for PL/SQL stored procedures. This could be good for their business, as it facilitates the migration from Oracle. But …

Replicating data into Clickhouse

ClickHouse is a relatively new analytics and data warehouse engine that provides very quick insertion and analysis of data. Like most analytics platforms, it is built on column-oriented storage, and unlike many alternatives it is completely open source. It’s also exceedingly fast, even on relatively modest platforms.

ClickHouse does differ from some other environments: for example, inserted data cannot easily be updated, and it supports a number of different storage and table engine formats that are used to store and index the information. So how do we get data into it from our MySQL transactional store?

Well, you can do dumps and loads, or you could use Tungsten Replicator to do that for you. The techniques I’m going to describe here are not in an active release, but use the same principles as other parts of our data loading.

We’re going to use the CSV-based batch loading system that is …
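To give a feel for what CSV-based batch loading means in general, here is a minimal Python sketch that groups rows into CSV documents of a fixed maximum size. It is not Tungsten Replicator code: the function names `csv_batches` and `_to_csv`, the batch size, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from typing import Iterable, Iterator, List, Sequence


def _to_csv(batch: List[Sequence[object]]) -> str:
    """Serialize one batch of rows as CSV text with '\n' line endings."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(batch)
    return buffer.getvalue()


def csv_batches(rows: Iterable[Sequence[object]], batch_size: int) -> Iterator[str]:
    """Yield CSV documents holding at most batch_size rows each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[Sequence[object]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield _to_csv(batch)
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield _to_csv(batch)


if __name__ == "__main__":
    sample_rows = [(1, "alice"), (2, "bob"), (3, "o'neil, pat")]
    for document in csv_batches(sample_rows, batch_size=2):
        print(repr(document))
```

Each yielded document could then be handed to a bulk loader in one go, for example through ClickHouse's CSV input format. Loading in batches rather than row by row suits a store where inserts are cheap but updates are not.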

Percona Live Europe Presents: ClickHouse at MessageBird: Analysing Billions of Events in Real-Time

We’ll look into how ClickHouse allows us to ingest a large amount of data and run complex, interactive analytical queries at MessageBird. We also present the business needs that brought ClickHouse to our attention and detail the journey to its deployment. We cover the problems we faced and how we dealt with them. We talk about our current cloud production setup and how we deployed and use it.

We are really enthusiastic to share a use case of ClickHouse: how it helped us scale our analytics stack, with the good, the bad and the ugly.

The talk could be useful to newcomers and to anyone wondering whether ClickHouse could be useful to them.

What we’re looking forward to…

There are many talks, but these are among the top ones we’re looking forward to in particular:

ClickHouse: Two Years!

Following my post from a year ago https://www.percona.com/blog/2017/07/06/clickhouse-one-year/, I wanted to review what happened in ClickHouse during this year.
There is indeed some interesting news to share.

1. ClickHouse in DB-Engines Ranking. It did not quite get into the top 100, but the gain from position 174 to 106 is still impressive. Its DB-Engines Ranking score nearly tripled, from 0.54 last September to 1.57 this September.

And indeed, in my conversations with customers and partners, the narrative has changed from: “ClickHouse, what is …

Easy and Effective Way of Building External Dictionaries for ClickHouse with Pentaho Data Integration Tool

In this post, I illustrate how to use the Pentaho Data Integration (PDI) tool to set up external dictionaries in MySQL to support ClickHouse. Although I use MySQL in this example, you can use any PDI-supported source.

ClickHouse

ClickHouse is an open-source column-oriented DBMS (columnar database management system) for online analytical processing. Source: wiki.

Pentaho Data Integration

Information from the Pentaho wiki: Pentaho Data Integration (PDI, also called Kettle) is the component of Pentaho responsible for the Extract, Transform and Load (ETL) processes. Though ETL tools are most frequently used in data warehouse environments, PDI can also be used for other purposes:

  • Migrating data between …
Analyze Your Raw MySQL Query Logs with ClickHouse

In this blog post, I’ll look at how you can analyze raw MySQL query logs with ClickHouse.

For typical query performance analyses, we have an excellent tool in Percona Monitoring and Management. You may want to go deeper, though. You might be longing for the ability to query raw MySQL “slow” query logs with SQL.

There are a number of tools to load the MySQL slow query logs to a variety of data stores. For example, you can find posts showing how to do it with LogStash. While very flexible, these solutions always look too complicated and limited in functionality to me.   
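To make concrete what parsing a slow query log involves, here is a minimal Python sketch that extracts the numeric fields from the `# Query_time:` header line of one log entry. It is not taken from any of the tools mentioned in this post; the function name `parse_metrics_line` is hypothetical and the sample values are made up, following the usual layout of that header line.

```python
import re
from typing import Dict

# Matches "Name: 1.23" pairs such as "Query_time: 0.000123".
_METRIC = re.compile(r"(\w+):\s+([0-9.]+)")


def parse_metrics_line(line: str) -> Dict[str, float]:
    """Return the numeric fields of a slow-log '# Query_time:' line."""
    if not line.startswith("# Query_time:"):
        raise ValueError("not a slow-log metrics line")
    return {name: float(value) for name, value in _METRIC.findall(line)}


if __name__ == "__main__":
    sample = "# Query_time: 0.000123  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 1"
    print(parse_metrics_line(sample))
```

A full loader would also need to handle the timestamp and user lines and the multi-line query text, which is where dedicated tools earn their keep.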

By far the best solution to parse and load MySQL slow query logs (among multiple log types supported) is Charity …
