MySQL offers a wide array of options to configure replication, but with all of those options, how can you be sure you are doing it right? Replication is the first step to providing a higher level of availability to your MySQL database. A well-configured replication architecture can be the difference between your data being highly available and your MySQL setup becoming a management nightmare. At PlanetScale, we support hundreds of thousands of database clusters, all using replication to provide high availability, so we have a little bit of experience in this arena! In this article, we’re going to explore some of the best practices when it comes to replication, both locally and across longer distances. Use an active/passive configuration When replicating with active/passive mode, one MySQL server acts as the source and all other servers are read-only replicas from that source. In this configuration, the replicas can be used to serve up read-only …
[Read more]Have you heard about MySQL replication but aren’t sure exactly why you should care? Having multiple servers for any workload is typically considered best practice. After all, a workload split across multiple servers helps balance out the performance of any application. When it comes to working with your database, though, the benefits may not be as clear. In this article, you’ll learn about five real-world use cases for implementing MySQL replication. What is MySQL replication? Before we get into its use cases, let us briefly describe what MySQL replication is. MySQL replication is a process that is used to keep multiple MySQL servers in sync. When you first set up a MySQL environment, it is typically with a single server to run your databases. One approach to scaling your database environment is to configure additional servers to contain copies of your database (replicas) that match the primary MySQL server (source). As data is updated, written …
[Read more]When the performance of your database server starts to decline from general usage, you have several options for optimizing it. One common method for optimizing your MySQL database is through partitioning. In this article, we’ll cover the basics of MySQL partitioning, how to apply partitioning to your database, and we'll discuss how it’s related to sharding. The basics of MySQL partitioning Partitioning is the idea of splitting something large into smaller chunks. In MySQL, the term “partitioning” means splitting up individual tables of a database. When you partition a table in MySQL, the table is split up into several logical units known as partitions, which are stored separately on disk. When data is written to the table, a partitioning function will be used by MySQL to decide which partition to store the data in. The value for one or more columns in a given row is used for this sorting process. MySQL provides several partitioning functions …
[Read more]Deep dive into MySQL’s connection handling mechanisms for optimal connection pooling and improved concurrency.
Scaling a database presents challenges. As you grow, you might begin having trouble handling ever-increasing throughput or data size. You might find that query latency is getting worse. You might be pushing the limits of your hardware. When this happens, a classic option is vertically scaling your database by getting better hardware, but is there a better way? And what happens when you reach the vertical limits? This is where horizontal sharding comes in. In this article, we'll cover some common indicators that your database may be ready for horizontal sharding. We'll also look at some measures you can implement until then. Let's dig in. Hitting the limits There are lots of different limits that you can run into when you're scaling up. At the database level, you might be maxing out CPU, memory, disk space, or IOPS. Running into these limits can have real consequences for your business. Database operations like schema changes will start taking longer, …
[Read more]For developers building out an application, a transactional datastore is the obvious and proven choice, but with success comes scale limitations. A monolithic database works well initially, but as an application sees growth, the size of its data will eventually grow beyond what is optimal for a single server. Implementing read replicas can improve your performance but will likely add lag between your primary and your replicas, leading to performance or correctness issues for your application. These complexities can sometimes require major architectural changes, leading to a suboptimal user experience and a difficult compromise between application performance and data consistency. Scaling write traffic is even more challenging; for example, even the largest MySQL database will see performance issues at a certain point. Horizontal sharding This is not a new challenge; organizations have faced it for years, and horizontal sharding is one …
[Read more]Every day PlanetScale processes more than 10 billion of our customers’ queries. We need to collect, store, and serve telemetry data generated by these queries to power Insights, our built-in query performance tool. This post describes how we built a scalable telemetry pipeline using Apache Kafka and a sharded PlanetScale database. Insights requirements To show you Insights, we pull from the following datasets:
- Database-level time series data (e.g., queries per second across the entire database).
- Query pattern-level time series data (e.g., p95 for a single query pattern like SELECT email FROM users where id = %).
- Data on specific query executions for slow/expensive queries (the “slow query log”).

The database-level data fits well into a time series database like Prometheus, but we run into several issues when trying to store the query pattern-level data in a time series database. On any given day, there are tens of millions of unique query …
[Read more]There are several different ways to store dates and times in MySQL, and knowing which one to use requires understanding what you'll be storing and how MySQL handles each type. There are five column types that you can use to store temporal data in MySQL. They are: DATE, DATETIME, TIMESTAMP, YEAR, and TIME. Each column type stores slightly different data, has different minimum and maximum values, and requires different amounts of storage. In the table below, you'll see each column type and its attributes.

| Column    | Data        | Bytes                     | Min                 | Max                 |
|-----------|-------------|---------------------------|---------------------|---------------------|
| DATE      | Date only   | 3                         | 1000-01-01          | 9999-12-31          |
| DATETIME  | Date + time | 5 (8 before MySQL 5.6.4)  | 1000-01-01 00:00:00 | 9999-12-31 23:59:59 |
| TIMESTAMP | Date + time | 4                         | 1970-01-01 00:00:01 | 2038-01-19 03:14:07 |
| YEAR      | Year only   | 1                         | 1901                | 2155                |
| TIME      | Time only   | 3                         | -838:59:59          | 838:59:59           |
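As a rough illustration of the five types in the table above, the sketch below declares one column of each type and inserts a single row. The table name `temporal_examples` and all column names are hypothetical, chosen only for this example.

```sql
-- Hypothetical table: the names here are illustrative, not from the article.
CREATE TABLE temporal_examples (
  id         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  birth_date DATE,       -- date only
  starts_at  DATETIME,   -- date + time
  created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, -- date + time, filled in automatically
  model_year YEAR,       -- year only
  duration   TIME        -- time only; may exceed 24 hours
);

-- One row using a literal for each type (created_at takes its default).
INSERT INTO temporal_examples (birth_date, starts_at, model_year, duration)
VALUES ('1990-05-17', '2024-03-01 09:30:00', 2024, '36:15:00');
```

The `duration` value of 36 hours is valid because TIME ranges up to 838:59:59, so it can hold elapsed time as well as a time of day.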
Dates, years, and times Based on this …
[Read more]One of the hidden gems in the MySQL documentation is this note in section 8.3.6: As an alternative to a composite index, you can introduce a column that is “hashed” based on information from other columns. If this column is short, reasonably unique, and indexed, it might be faster than a “wide” index on many columns. We will build on this idea by creating generated hash columns for indexed lookups on large values and enforcing uniqueness across many columns. Instead of creating huge composite indexes, we'll index the compact generated hashes for fast lookups. Before diving into generated hash columns, let's look at generated columns in general. Generated columns in MySQL A generated column can be considered a calculated, computed, or derived column. It is a column whose value results from an expression rather than direct data input. The expression can contain literal values, built-in functions, or references to other columns. The result of …
[Read more]When working with MySQL (or any database!), it's essential to understand how indexes work and how they can be used to improve the efficiency of queries. An index is a separate data structure that maintains a copy of part of your data, structured to allow quick data retrieval. Usually, this structure is a B+ Tree. We have an entire post on how indexes work if you want to go into greater detail. Obfuscated indexes Creating indexes is only part of the battle. You must also know how to write your queries so that you allow MySQL to use your indexes. One common mistake people make when writing queries is that they obfuscate their indexes. Obfuscating an index simply means that you're hiding the indexed value from MySQL. Let's say you have a todos table with a created_at column that records a timestamp of when the record was created.

CREATE TABLE `todos` (
  `id` int NOT NULL AUTO_INCREMENT,
  `title` varchar(255) NOT NULL,
  `created_at` timestamp NOT NULL …
[Read more]