Showing entries 1 to 9
Displaying posts with tag: row (reset)
Is MySQL Statement-Based / Mixed Replication Really Safe?

The binary logging format in MySQL has been ROW by default since MySQL 5.7, yet there are still many users sticking with STATEMENT or MIXED formats for various reasons. In some cases, there is just simple hesitation from changing something that has worked for years on legacy applications. But in others, there may be serious blockers, most typically missing primary keys in badly designed schemas, which would lead to serious performance issues on the replicas.

As a Support Engineer, I can still see quite a few customers using STATEMENT or MIXED formats, even if they are already on MySQL 8.0. In many cases this is OK, but recently I had to deal with a pretty nasty case, where not using ROW format was found to cause the replicas to silently lose data updates, without raising any replication errors! Was it some really …

[Read more]
The Fast Way to Import CSV Data Into a Tungsten Cluster

The Question Recently, a customer asked us:

After importing a new section of user data into our Tungsten cluster, we are seeing perpetually rising replication lag. We are sitting at 8.5hrs estimated convergence time after importing around 50 million rows and this lag is climbing continuously. We are currently migrating some of our users from a NoSQL database into our Tungsten cluster. We have a procedure to write out a bunch of CSV files after translating our old data into columns and then we recursively send them to the write master using the mysql client. Specifically our import SQL is doing LOAD DATA LOCAL INFILE and the reading in a large CSV file to do the import. We have 20k records per CSV file and we have 12 workers which insert them in parallel.

Simple Overview The Skinny

In cases like this, the slaves are having trouble with the database unable to keep up with the apply stage …

[Read more]
A Nice Introduction to MySQL Window Functions III

Windowing Functions can get quite complex very quickly when you start taking advantage of the frame clause. Ranges and rows can get confusing.  So for review lets look at how the specification looks:

Window_spec:
   [window name] [partition clause] [order clause] [frame clause]

That looks simple. And them come terms like UNBOUNDED PRECEDING that could put a knot in your gut.  The manual is not exactly written to help novices in this area get up to speed.  But don't panic.  If you work through the examples that follow (and please do the preceding part of this series before trying these examples) you will have a better appreciation of what is going on with window function.

The Frame Clause
So the frame clause is optional in the window function.  A frame is considered a subset of the current partition and defines that subset.  Frames are determined with …

[Read more]
Row Store and Column Store Databases

In this blog post, we’ll discuss the differences between row store and column store databases.

Clients often ask us if they should or could be using columnar databases. For some applications, a columnar database is a great choice; for others, you should stick with the tried and true row-based option.

At a basic level, row stores are great for transaction processing. Column stores are great for highly analytical query models. Row stores have the ability to write data very quickly, whereas a column store is awesome at aggregating large volumes of data for a subset of columns.

One of the benefits of a columnar database is its crazy fast query speeds. In some cases, queries that took minutes or hours are completed in seconds. This makes columnar databases a good choice in a query-heavy …

[Read more]
InnoDB Redundant Row Format

Introduction

This article describes the InnoDB redundant row format. If you are new to InnoDB code base (a new developer starting to work with InnoDB), then this article is for you. I'll explain the row format by making use of a gdb session. An overview of the article is given below:

  • Create a simple table and populate few rows.
  • Access the page that contains the rows inserted.
  • Access a couple of rows and explain its format.
  • Give summary of redundant row format.
  • Useful gdb commands to analyse the InnoDB rows.
  • Look at a GNU Emacs Lisp function to traverse rows in an InnoDB index page.

To get the most out of this article, the reader is expected to repeat the gdb session as described here.

The Schema

Consider the following SQL statements to produce the schema:

CREATE TABLE t1 (f1 int unsigned) row_format=redundant …
[Read more]
Temporary Tables and Replication

I recently wrote about non-deterministic queries in the replication stream. That’s resolved by using either MIXED or ROW based replication rather than STATEMENT based.

Another thing that’s not fully handled by STATEMENT based replication is temporary tables. Imagine the following:

  1. Master: CREATE TEMPORARY TABLE rpltmpbreak (i INT);
  2. Wait for slave to replicate this statement, then stop and start mysqld (not just STOP/START SLAVE)
  3. Master: INSERT INTO rpltmpbreak VALUES (1);
  4. Slave: SHOW SLAVE STATUS \G

If for any reason a slave server shuts down and restarts after the temp table creation, replication will break because the temporary table will no longer exist on the restarted slave server. It’s obvious when you think about it, but nevertheless it’s quite …

[Read more]
Non-Deterministic Query in Replication Stream

You might find a warning like the below in your error log:

130522 17:54:18 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. Statements writing to a table with an auto-increment column after selecting from another table are unsafe because the order in which rows are retrieved determines what (if any) rows will be written. This order cannot be predicted and may differ on master and the slave.
Statement: INSERT INTO tbl2 SELECT * FROM tbl1 WHERE col IN (417,523)

What do MariaDB and MySQL mean with this warning? The server can’t guarantee that this exact query, with STATEMENT based replication, will always yield identical results on the slave.

Does that mean that you have to use ROW based (or MIXED) replication? Possibly, but not necessarily.

For this type of query, it primarily refers to the fact that without …

[Read more]
[MySQL] Deleting/Updating Rows Common To 2 Tables – Speed And Slave Lag Considerations

Introduction

A question I recently saw on Stack Overflow titled Faster way to delete matching [database] rows? prompted me to organize my thoughts and observations on the subject and quickly jot them down here.

Here is the brief description of the task: say, you have 2 MySQL tables a and b. The tables contain the same type of data, for example log entries. Now you want to delete all or a subset of the entries in table a that exist in table b.

Solutions Suggested By Others

DELETE FROM a WHERE EXISTS (SELECT b.id FROM b WHERE b.id = a.id);
DELETE a FROM a INNER JOIN b on a.id=b.id;
DELETE FROM a WHERE id IN (SELECT id FROM b)

The Problem With Suggested Solutions

Solutions above are all fine if the tables are quite small and the …

[Read more]
Will you use row-based replication by default?



MySQL 5.1 introduces row based replication, a way of replicating data that fixes many inconsistencies of the statement based replication, the standard method used by MySQL so far.


The good: row based replication solves some problems when replicating the result of non deterministic functions, such as UUID() or NOW().
The bad: row-based replication may break existing applications, where you count on the quirks of statement based replication to execute conditionally (updates base on @@server_id, for example), and may perform badly on updates applied to very large tables.

[Read more]
Showing entries 1 to 9