A recommendation we often give to our customers is along the lines of “archive old data” to reduce your database size. There is a tradeoff between keeping all our data online and archiving part of it to cold storage.
There could also be legal requirements to keep certain data online, or you might want to query old data occasionally without having to go through the hassle of restoring an old backup.
In this post, we will explore a very useful feature of AWS RDS/Aurora that allows us to export data to an S3 bucket and run SQL queries directly against it.
Archiving Data to S3
Let’s start by describing the steps we need to take to put our data into an S3 bucket in the required format, which is called Apache Parquet.
Amazon states the Parquet format is up to 2x faster to export and consumes up to 6x less storage in S3, compared to other text formats.
1. Create a snapshot of the database (or …[Read more]