Introducing schemadiff, an internal library in Vitess that has been one of its best-kept secrets until now. At its core, schemadiff is a declarative, programmatic library that can produce a diff in SQL format of two entities: tables, views, or full blown database schemas. But it then goes beyond that to normalize, validate, export, and even apply schema changes, all declaratively and without having to use a MySQL server. Let's dive in to understand its functionality and capabilities.
10 Older Entries »
In the last quarter of 2021, AWS released Aurora version 3. This new version aligns Aurora with the latest MySQL 8 version, porting many of the advantages MySQL 8 has over previous versions.
While this brings a lot of new interesting features for Aurora, what we are going to cover here is to see how DDLs behave when using the ONLINE option. With a quick comparison with what happens in MySQL 8 standard and with Group Replication.
All tests were run on an Aurora instance r6g.large with a secondary availability zone. The test was composed of:
- #1 to perform DDL
- #2 to perform insert data in the table I am altering
- #3 to perform insert data on a different table
- #4 checking the other node operations
In the Aurora instance, a …[Read more]
Vitess supports managed, non-blocking schema migrations based on VReplication, aptly named vitess migrations. Vitess migrations are powerful, revertible, and failure agnostic. They take an asynchronous approach, which is more lightweight on the database server. The asynchronous approach comes with an implementation challenge: how to cut-over with minimal impact to the user/app and risk free of data loss. In this post we take a deep dive into the cut-over logic used in vitess migrations.
In April 2021, I wrote an article about Online DDL and Group Replication. At that time we were dealing with MySQL 8.0.23 and also opened a bug report which did not have the right answer to the case presented.
Anyhow, in that article I have shown how an online DDL was de facto locking the whole cluster for a very long time even when using the consistency level set to EVENTUAL.
This article is to give justice to the work done by the MySQL/Oracle engineers to correct that annoying inconvenience.
Before going ahead, let us remember how an Online DDL was propagated in a group replication cluster, and identify the differences with what happens now, all with the consistency level set to EVENTUAL ( …[Read more]
When using PXC, the cluster relies on the wsrep_OSU_method parameter to define the Online Schema Upgrade (OSU) method the node uses to replicate DDL statements.
Until now, we normally have three options:
- Use Total Isolation Order (TOI, the default)
- Use Rolling Schema Upgrade (RSU)
- Use Percona’s online schema change tool (TOI + PTOSC)
Each method has some positive and negative aspects. TOI will lock the whole …[Read more]
Query planning is hard # Have you ever wondered what goes on behind the scenes when you execute a SQL query? What steps are taken to access your data? In this article, I'll talk about the history of Vitess's V3 query planner, why we created a new query planner, and the development of the new Gen4 query planner. Vitess is a horizontally scalable database solution which means that a single table can be spread out across multiple database instances.
Originally posted at Andres's blog. Traditional query optimizing is mostly about two things: first, in which order and from where to access data, and then how to then combine it. You have probably seen the tree shapes execution plans that are produced from query planning. I’ll use an example from the MySQL docs, using FORMAT=TREE which was introduced in MySQL 8.0: mysql>EXPLAINFORMAT=TREE->SELECT*->FROMt1->JOINt2->ON(t1.c1=t2.c1ANDt1.c2<t2.c2)->JOINt3->ON(t2.c1=t3.c1)\G***************************1.row***************************EXPLAIN:->Innerhashjoin(t3.c1=t1.c1)(cost=1.05rows=1)->Tablescanont3(cost=0.35rows=1)->Hash->Filter:(t1.c2<t2.c2)(cost=0.70rows=1)->Innerhashjoin(t2.c1=t1.c1)(cost=0.70rows=1)->Tablescanont2(cost=0.35rows=1)->Hash->Tablescanont1(cost=0.35rows=1)Here we can see that the MySQL optimizer thinks the best plan is to start reading from t1 using a table scan.
This post explains the inherent problem of running online schema changes in MySQL, on tables participating in a foreign key relationship. We'll lay some ground rules and facts, sketch a simplified schema, and dive into an online schema change operation. Our discussion applies to gh-ost, pt-online-schema-change, and VReplication based migrations, or any other online schema change tool that works with a shadow/ghost table like the Facebook tools. Why Online DDL? # Online schema change tools come as workarounds to an old problem: schema migrations in MySQL were blocking, uninterruptible, aggressive in resources, replication unfriendly.
Cross posting link Golang is a wonderful language. It's simple, and most of the time not confusing or surprising. This makes it easy to jump into library code and start reading and quickly understand what's going on. On the other hand, coming from other languages, there are a few features that would make our lives easier. We are building Vitess using mostly golang, and most of us are happy with this choice.
Vitess introduces a new way to run schema migrations: non-blocking, asynchronous, scheduled online DDL. With online DDL Vitess simplifies the schema migration process by taking ownership of the operational overhead, and providing the user a simple, familiar interface: the standard ALTER TABLE statement. Let’s first give some background and explain why schema migrations are such an issue in the databases world, and then dive into implementation details The relational model and the operational overhead # The relational model is one of the longest surviving models in the software world, introduced decades ago and widely used until today.
10 Older Entries »