Showing entries 1 to 10 of 40
10 Older Entries »
Displaying posts with tag: design (reset)
Improving the design of MySQL replication

Now that MySQL 8.0 has been revealed, it's time to take a deep look at replication features in the latest releases, and review its overall design.

Server UUID vs Server-ID

At the beginning of replication, there was the server_id variable that identified uniquely a node in a replication system. The variable is still here, but in MySQL 5.6 it was joined by another value, which is created during the server initialisation, regardless of its involvement in a replication system. The server_uuid is a string of hexadecimal characters that is the basis for global transaction identifiers:

select @@server_id, @@server_uuid;
| @@server_id | @@server_uuid |
+-------------+--------------------------------------+ …
[Read more]
How to structure and design a relational database to support you data storage needs?

Well, every now and then, when we began to start a new project or app, which has some data storage requirement, we have a deep intriguing thought as to how best represent the data structure so as to support a variety of needs including but not limited to (ACID rules):

1. Normalization
2. Reliability
3. Consistency
4. And many others

Below, I provide a set of steps which you can follow to arrive at a data model that correctly suites your requirements.


1. Identify the project or app requirements / specifications and business rules which tell you what your app will be able to do when it is ready.
2. From these business rules, identify possible objects for each business rule and mark them in a paper using rectangular sections like authors, posts etc.
3. Once you have recognized the …

[Read more]
Reports exaggerated

I've been letting the blog rest recently, and not so recently as well.  The problem is not a lack of subjects, but a lack of time to do them any justice.  However it is quite sad to see that my last entry was in September 2012, so it is time to post again.

Of late I have been pondering what I have to say about :

  • Distributed MVCC and write-scaling
  • Different approaches to eventual consistency with replicated RDBMS
  • Various MySQL Cluster related topics
  • Various general rambling and unstructured topics

However, these will take some time to percolate and calcify.

In the meantime here are some things I have found interesting recently :

[Read more]
Parallel replication: off by one

One of the most common errors in development is where a loop or a retrieval by index falls short or long by one unit, usually because of an oversight or a logic in coding.

Of the following snippets, which one will run 10 times?

/* #1 */    for (N = 0 ; N < 10; N++) printf("%d\n", N);

/* #2 */ for (N = 0 ; N <= 10; N++) printf("%d\n", N);

/* #3 */ for (N = 1 ; N <= 10; N++) printf("%d\n", N);

/* #4 */ for (N = 1 ; N < 10; N++) printf("%d\n", N);

The question is deceptive, as there are two snippets that will run 10 times (1 and 3). But they will print different numbers. If you ware aiming for numbers from 1 to 10, only #3 is good.

After many years of programming, off-by-one errors are rare in my code, and I have been able to spot them or prevent them at first sight. That’s why I feel uneasy when I look at the way parallel replication is enabled in …

[Read more]
MySQL Workbench 6.0 - New Design and Many Enhancements

New GUI, 30+ New Features, and Major New Components

Oracle is excited to announce the immediate availability of the production-read, GA release of MySQL Workbench 6.0, available for download under the GPL, as well as part of the Commercial MySQL Standard, Enterprise, and Cluster Carrier Grade Editions with 24x7 global support.

The need by database professionals for management tools has increased with expanding data volumes, web, cloud and mobile computing growth. Improvement and additions in MySQL Workbench helps developers and administrators better manage these dynamic data environments. This latest GA release includes many new features and a modernized user interface that allows users to simplify MySQL database development, design and administration.

The goal of MySQL Workbench 6.0 is to simplify and improve the workflow in the graphical user interface (GUI) as well as adding new features and …

[Read more]
Implementing efficient Geo IP location system in MySQL

Often application needs to know where a user is physically located. The easiest way to figure that out is by looking up their IP address in a special database. It can all be implemented in MySQL, but I often see it done inefficiently. In my post I will show how to implement a complete solution that offers great performance.

Importing Geo IP data

First you will require a database mapping network addresses to real locations. There are various resources available, but I chose the one nginx web server uses with its geoip module. GeoLite City comes in CSV format and is available for download with no charge from MaxMind.

The archive contains two files. GeoLiteCity-Blocks.csv lists all IP ranges and maps each one to the corresponding location identifier, while …

[Read more]
(My)SQL mistakes. Do you use GROUP BY correctly?

Often I see a SQL problem solved incorrectly and I do not mean inefficiently. Simply incorrectly. In many cases the developer remains unaware that they aren’t getting the results they were expecting or even if a result is correct, it is only by chance, for example because the database engine was smart enough to figure out some non-sense in a query. In a few posts I will try to disclose some of the more common problems.

Aggregate with GROUP BY

Unlike many other database systems, MySQL actually permits that an aggregate query returns columns not used in the aggregation (i.e. not listed in GROUP BY clause). It could be considered as flexibility, but in practice this can easily lead to mistakes if a person that designs queries does not understand how they will be executed. For example, what values an aggregate query returns for a column that wasn’t part of the grouping key?

mysql> SELECT user_id, id, …
[Read more]
The CAP theorem and MySQL Cluster

tldr; A single MySQL Cluster prioritises Consistency in Network partition events. Asynchronously replicating MySQL Clusters prioritise Availability in Network partition events.

I was recently asked about the relationship between MySQL Cluster and the CAP theorem. The CAP theorem is often described as a pick two out of three problem, such as choosing from good, cheap, fast. You can have any two, but you can't have all three. For CAP the three qualities are 'Consistency', 'Availability' and 'Partition tolerance'. CAP states that in a system with data replicated over a network only two of these three qualities can be maintained at once, so which two does MySQL Cluster provide?

Standard 'my interpretation of CAP' section

Everyone who discusses CAP like to rehash it, and I'm no exception. …

[Read more]
One billion

As always, I am a little late, but I want to jump on the bandwagon and mention the recent MySQL Cluster milestone of passing 1 billion queries per minute. Apart from echoing the arbitrarily large ransom demand of Dr Evil, what does this mean?

Obviously 1 billion is only of interest to us humans as we generally happen to have 10 fingers, and seem to name multiples in steps of 10^3 for some reason. Each processor involved in this benchmark is clocked at several billion cycles per second, so a single billion is not so vast or fast.

Measuring over a minute also feels unnatural for a computer performance benchmark - we are used to lots of things happening every second! A minute is a long time in silicon.

What's more, these …

[Read more]
Eventual Consistency in MySQL Cluster - implementation part 3

As promised, this is the final post in a series looking at eventual consistency with MySQL Cluster asynchronous replication. This time I'll describe the transaction dependency tracking used with NDB$EPOCH_TRANS and review some of the implementation properties.

Transaction based conflict handling with NDB$EPOCH_TRANS

NDB$EPOCH_TRANS is almost exactly the same as NDB$EPOCH, except that when a conflict is detected on a row, the whole user transaction which made the conflicting row change is marked as conflicting, along with any dependent transactions. All of these rejected row operations are then handled using inserts to an exceptions table and realignment operations. This helps avoid the row-shear problems described …

[Read more]
Showing entries 1 to 10 of 40
10 Older Entries »