Creating An External Slave For A Live AWS Aurora Instance


When working with Amazon AWS Aurora, there are some steps to consider when trying to get data out of an active Aurora master and into a slave, potentially on an EC2 instance or offsite in another data centre. Creating an external MySQL slave for Aurora gives you the option to move off Aurora, or the flexibility to move data around as desired. With AWS RDS instances this task is pretty simple, because you can do the following (a command-level sketch follows the list):

  1. Create a read replica
  2. Stop the slave process
  3. Capture the positioning
  4. Dump the database
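
For reference, here is a minimal command-level sketch of those four steps against a plain RDS instance; the instance identifiers, endpoint and database name below are hypothetical:

# 1. Create a read replica of the RDS master
aws rds create-db-instance-read-replica \
    --db-instance-identifier mydb-replica \
    --source-db-instance-identifier mydb-master

# 2. Stop replication on the replica via the RDS stored procedure
mysql -h mydb-replica.example.rds.amazonaws.com -u admin -p \
    -e "CALL mysql.rds_stop_replication;"

# 3. Capture the master log file and position
mysql -h mydb-replica.example.rds.amazonaws.com -u admin -p \
    -e "SHOW SLAVE STATUS\G" | grep -E 'Relay_Master_Log_File|Exec_Master_Log_Pos'

# 4. Dump the database from the now-static replica
mysqldump -h mydb-replica.example.rds.amazonaws.com -u admin -p \
    --single-transaction --routines --triggers mydb > mydb.sql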

With Aurora it’s a little trickier, because a read replica in Aurora has no slave process. All of the replication is handled on the back end and cannot be controlled. However, setting up an external slave can be done.

Amazon AWS Documentation

In …

[Read more]
Why use provisioned IOPS volumes for AWS databases?

In this blog, we’ll use some test results to look at the rationale for using provisioned IOPS volumes for AWS databases.

One piece of advice you often hear about running MySQL, MongoDB or other databases in the AWS EC2 environment is that you should use volumes with provisioned IOPS. This kind of makes sense at the “marketing” level, where provisioned IOPS (io1) volumes are designed for IO-intensive database workloads, while General Purpose (gp2) volumes are not. But if you go to the AWS volume type description, you will find that gp2 volumes are shown to have pretty good IO performance. So where do all these supposed database performance problems for Amazon Elastic Block Store (EBS) volumes without provisioned IOPS come from?
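
As a concrete point of comparison, this is how the two volume types are requested through the AWS CLI; a minimal sketch, where the size, IOPS value and availability zone are hypothetical:

# General Purpose SSD: baseline IOPS scale with volume size, with burst capability
aws ec2 create-volume --availability-zone us-east-1a --size 200 --volume-type gp2

# Provisioned IOPS SSD: a guaranteed IOPS rate you specify up front
aws ec2 create-volume --availability-zone us-east-1a --size 200 --volume-type io1 --iops 4000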

Here is what I found out running experiments with a beta of …

[Read more]
AWS Aurora Benchmarking part 2

Some time ago, I published the article on AWS Aurora Benchmarking (AWS Aurora Benchmarking – Blast or Splash?), in which I analyzed the behavior of different solutions using synchronous replication in AWS environment. This blog follows up with some of the comments and suggestions I received regarding that post from the community and Amazon engineers.

I decided to perform another round of tests, keeping in mind comments and suggestions received.

I presented some of the results during the Percona conference in Santa Clara in April 2016. The following is a transposition of that presentation, with more details.

Not interested in the preliminary descriptions? Skip ahead to the results section.

Why new tests?

[Read more]
MySQL Benchmark in the Cloud


Testing functionalities and options for a database can be challenging at times, as a live production environment might be required. As I was looking for different options, I was directed by Derek Downey to this post in the Percona blog.

The blog discussed an interesting and fun tool from Percona, tpcc-mysql. I was interested in testing the tool so I decided to play around with it in an AWS EC2 server.

In this post I will expand on the Percona blog post, since the tool lacks documentation, and explain how I used it to run a MySQL benchmark in AWS.
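
To give a taste of what that looks like in practice, below is a minimal sketch of building and running tpcc-mysql; the host, credentials and warehouse count are hypothetical, and flag syntax can vary between versions of the tool:

# Build the binaries (requires the MySQL client development libraries)
git clone https://github.com/Percona-Lab/tpcc-mysql.git
cd tpcc-mysql/src && make && cd ..

# Create the schema and indexes
mysqladmin -u root -p create tpcc1000
mysql -u root -p tpcc1000 < create_table.sql
mysql -u root -p tpcc1000 < add_fkey_idx.sql

# Load 1000 warehouses (this can take many hours)
./tpcc_load -h 127.0.0.1 -d tpcc1000 -u root -p "" -w 1000

# Run for one hour (-l 3600) with 32 sessions (-c) after a 10-second ramp-up (-r)
./tpcc_start -h 127.0.0.1 -P 3306 -d tpcc1000 -u root -p "" -w 1000 -c 32 -r 10 -l 3600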

Why tpcc-mysql?

There are various reasons why tpcc-mysql could be a good …

[Read more]
5 core pieces of the Amazon Cloud puzzle to get your project off the ground

One of the most common engagements I do is working with firms in and around the NYC startup sector. I evaluate AWS infrastructures & applications built in the Amazon cloud. Join 32,000 others and follow Sean Hull on twitter @hullsean. I’ve seen some patterns in customers’ usage of Amazon. Below is a laundry list of …

[Read more]
When hosting data on Amazon turns bloodsport

There’s a strong trend to automation across the cloud. That’s a great thing for startups because it reduces operational headaches & lets them focus on building products. Join 31,000 others and follow Sean Hull on twitter @hullsean. But as that trend begins to touch the database tier, all sorts of complications emerge. Let’s take a …

MySQL performance optimization: 50% more work with 60% less latency variance

When I joined Pinterest, my first three weeks were spent in Base Camp, where the newest engineering hires work on real production issues across the entire software stack. In Base Camp, we learn how Pinterest is built by building it, and it’s not uncommon to be pushing code and making meaningful contributions within just a few days. At Pinterest, newly hired engineers have the flexibility to choose which team they’ll join, and working on different parts of the code as part of the Base Camp experience can help with this decision. Base Campers typically work on a variety of tasks, but my project was a deep dive into a MySQL performance optimization project.

Pinterest, MySQL and AWS, oh my!

We work with MySQL running entirely inside Amazon Web Services (AWS). Despite using fairly high-powered instance types with RAID-0 SSDs and a fairly simple workload (many point selects by PK or simple ranges) that peaks around 2,000 …
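
To make that workload shape concrete, the two query patterns look roughly like this; the table and column names are hypothetical, not Pinterest’s actual schema:

-- Point select by primary key
SELECT * FROM pins WHERE pin_id = 12345;

-- Simple range scan
SELECT * FROM pins WHERE user_id = 678 AND pin_id > 90000 LIMIT 100;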

[Read more]
Auditing MySQL with McAfee and MongoDB

Greetings everyone! Let’s discuss a third-party auditing solution for MySQL and how we can leverage MongoDB® to make sense of all of that data.

The McAfee MySQL Audit plugin does a great job of capturing, at a low level, activity within a MySQL server. It does this through some non-standard APIs, which is why installing and configuring the plugin can be a bit difficult. By default, the audit information is stored in JSON format in a text file.

There is 1 JSON object for each action that takes place within MySQL. If a user logs in, there’s an object. If that user queries a table, there’s an object. Imagine 1000 active connections from an application, each doing 2 queries per second. That’s 2000 JSON objects per second being written to the audit log. After 24 hours, that would be almost 173,000,000 audit entries!

How does one make sense of that many JSON objects? One option would be to write your own parser in …
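
Another option, hinted at by the title, is to let MongoDB do the heavy lifting. Since the plugin writes one JSON object per line, a single mongoimport run can load the file for ad-hoc querying; this is just a sketch, and the file path, database, collection and field names are hypothetical:

# Import the newline-delimited JSON audit log into a MongoDB collection
mongoimport --db audit --collection mysql_audit --file /var/lib/mysql/mysql-audit.json

# Example: count audit entries per user from the mongo shell
mongo audit --eval 'db.mysql_audit.aggregate([{ $group: { _id: "$user", n: { $sum: 1 } } }]).forEach(printjson)'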

[Read more]
fsfreeze in Linux

The fsfreeze command is used to suspend and resume access to a file system. This allows consistent snapshots to be taken of the filesystem. fsfreeze supports Ext3/4, ReiserFS, JFS and XFS.

A filesystem can be frozen using the following command:

# /sbin/fsfreeze -f /data

Now any process writing to this filesystem will block. For example, the following command will get stuck in the D (uninterruptible sleep) state:

# echo "testing" > /data/file

Only after the filesystem is unfrozen, using the following command, can the process continue:

# /sbin/fsfreeze -u /data

As per the fsfreeze man page, “fsfreeze is unnecessary for device-mapper devices. The device-mapper (and LVM) automatically freezes a filesystem on the device when a snapshot creation is requested.”
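
Putting the two together: for a filesystem that is not on device-mapper, a consistent EBS snapshot can be wrapped in a freeze/unfreeze pair. A minimal sketch, with a hypothetical volume ID (create-snapshot returns as soon as the point-in-time snapshot is initiated, so the freeze window stays short):

# /sbin/fsfreeze -f /data

# aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "consistent snapshot of /data"

# /sbin/fsfreeze -u /data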

fsfreeze is provided by the util-linux package in RHEL systems. Along with userspace support, fsfreeze also …

[Read more]
Licensing Oracle in a public cloud: the CPU calculation impact

First of all, a disclaimer: I don’t work for Oracle nor do I speak for them. I believe this information to be correct, but for licensing questions, Oracle themselves have the final word.

With that out of the way, followers of this blog may have seen some of the results from my testing of actual CPU capacity with public clouds like Amazon Web Services, Microsoft Azure, and Google Compute Engine. In each of these cases, a CPU “core” was actually measured to be equivalent to an x86 HyperThread, or half a physical core. So when provisioning public cloud resources, it’s important to include twice as many CPU cores as the equivalent …
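
A quick way to check this on a running instance is to compare the advertised vCPU count with the topology Linux reports. A minimal sketch, assuming a Linux guest with util-linux installed:

# lscpu | grep -E '^CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket'

# nproc

If lscpu reports 2 threads per core, each advertised vCPU is a HyperThread rather than a full physical core.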

[Read more]