Showing entries 1 to 7
Displaying posts with tag: pt-archiver (reset)
Using pt-online-schema-change for Purging Rows and Reclaiming Disk Space in a Single Operation

You probably missed the news, but… PT-1751: Adds –where param to pt-online-schema-change This brings the possibility to perform what I would call an “inverted purge” because you are not actually purging rows from your multi-terabyte table, but rather, you copy the small percentage of rows you want to keep to a new table and then […]

MySQL Data Archival With Minimal Disruption

We all know that data is important, and some businesses need historical data to be available all the time. The problem is that queries on large tables perform poorly if they are not properly optimized. We get many customer requests in Managed Services to purge/archive large tables, and to achieve it, we use pt-archiver.

Recently, we received a request to archive a large table, and the customer was worried about the downtime and performance issues during the archival.

We proposed a solution to the customer to archive the table using pt-archive. The idea is to archive old data to other tables and keep the latest data on the current table with minimal performance issues. All of the data will remain available and can be queried anytime.

In the blog, I will …

[Read more]
pt-archiver Misbehaving With Secondary Index

Not long ago, we found a long-running query in one of our clients on Managed Services. While reviewing this long-running query, we found it was generated by a pt-archiver operation. That doesn’t sound good, right?

I was able to reproduce the issue on my test environment. In this article, I will show you how to use pt-archiver to purge data from a table using a secondary index. First, we need to have the full picture before going any deeper. Here is a representation of the table structure:

Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=4000001 DEFAULT …
[Read more]
Rescuing a crashed pt-online-schema-change with pt-archiver

This article discusses how to salvage a crashed pt-online-schema-change by leveraging pt-archiver and executing queries to ensure that the data gets accurately migrated. I will show you how to continue the data copy process, and how to safely close out the pt-online-schema-change via manual operations such as RENAME TABLE and DROP TRIGGER commands. The normal process to recover from a crashed pt-online-schema-change is to drop the triggers on your original table and drop the new table created by the script. Then you would restart pt-online-schema-change. In this case, this wasn’t possible.

A customer recently needed to add a primary key column to a very busy table (with around 200 million rows). The table only had a …

[Read more]
Want to archive tables? Use Percona Toolkit’s pt-archiver

Percona Toolkit’s pt-archiver is one of the best utilities to archive the records from large tables to another tables or files. One interesting thing is that pt-archiver is a read-write tool. It deletes data from the source by default, so after archiving you don’t need to delete it separately.

As it is done by default, you should take care before actually running it on then production server. You can test your archiving jobs with the – dry-run  OR you can use the –no-delete option if you’re not sure about. The purpose of this script is mainly to archive old data from the table without impacting OLTP queries and insert the data into …

[Read more]
5 Percona Toolkit Tools for MySQL That Could Save Your Day: April 3 Webinar

On April 3 at 10 a.m. PST, I’ll be giving a webinar titled “5 Percona Toolkit Tools for MySQL That Could Save Your Day.” In this presentation you’ll learn how to perform typical but challenging MySQL database administration tasks.

My focus will be on the following tools:

  • pt-query-digest, to select the queries you should try to improve to get optimal response times
  • pt-archiver, to efficiently purge purge data from huge tables
  • pt-table-checksum/pt-table-sync, to check if data on replicas is in sync with data on the master
  • pt-stalk, to gather data when performance problems happen randomly or are very short
  • pt-online-schema-change, to run …
[Read more]
Few words about pt-archiver

I really like the percona toolkit, we all love the percona toolkit.
I know how it’s difficult to write operational and efficient scripts (I try to do that myself everyday)
And it is even more difficult to share a script, to take the responsibility to share its own code.
From there, understand that this article is simply a review of my own thoughts about pt-archiver (with the invaluable assistance of @maximefouilleul), I don’t want to question the quality or usefulness of this tool.

I tried pt-archiver for the first time this week, and the first thing I do before using a tool is read the documentation (yes, I really like to read documentations)

I was intrigued by some options of this tool, first, I can read “It deletes data from the source by default“.
Personally, I hate that you want to remove my data by default, I prefer to have a –delete option …

[Read more]
Showing entries 1 to 7