Showing entries 11 to 20 of 71
Displaying posts with tag: ec2
Deploying Cloudera Impala on EC2 with Example Live Demo

A little while ago I blogged about (and open sourced) an Impala-powered soccer visualization demo, designed to demonstrate just how responsive Impala queries can be. Since not everyone has the time or resources to run the project themselves, we’ve decided to host it ourselves on an EC2 instance. You can try the visualization; we’ve also opened up the Impala web interface, where you can see query profiles and performance numbers, and Hue (username and password are both ‘test’), where you can run your own queries on the dataset.

Deploying Impala on EC2

While there are …

[Read more]
How to STOP SLAVE on Amazon RDS read replica

We are doing a migration from Amazon RDS to EC2 with a customer. This, unfortunately, involves some downtime – if you are an RDS user, you probably know you can’t replicate an RDS instance to an external server (or even EC2). While it is annoying, this post isn’t going to be a rant on how RDS can make you feel locked in. Instead, I wanted to give you a quick tip.

So here’s the thing: you can’t stop replication on an RDS read replica, because you don’t have (and won’t get) the privileges to do that:

replica> STOP SLAVE;
ERROR 1045 (28000): Access denied for user 'usr'@'%' (using password: YES)

Normally, you don’t want to do that. However, we wanted to run some pt-upgrade checks before we migrate, and for that we needed the read replica to stop replicating. Here’s one way to do it:

WARNING! …

[Read more]
Cloud Operations Interview

Read the original article at Cloud Operations Interview

What does a cloud computing expert need to know? How do you hire a cloud computing expert? Competition for operations & DBAs is fierce, so you’ll want to know how to find the best.

If you’re a systems administrator or ops guy, you may want to prepare for an interview for such a position. Meanwhile, if you’re a director of IT or operations, a recruiter or a manager in HR, you’ll want to have some idea how to find the right candidate.

Here’s my guide to do just that. You may also jump to …

[Read more]
AirBNB didn’t have to fail

Read the original article at AirBNB didn’t have to fail

Today part of Amazon Web Services failed, taking down with it a slew of startups that all run on Amazon’s Cloud infrastructure. AirBNB was one of the biggest, but Heroku, Reddit, Minecraft, Flipboard & Coursera also went down with it. It’s not the first time. What the heck happened, and why should we care?

1. Root Cause

The AWS service allows companies like AirBNB to build web applications, and host them on servers owned and managed by Amazon. The so-called raw iron of this army of compute power sits in datacenters. Each datacenter is …

[Read more]
Performance of MySQL Semi-Synchronous Replication Over High Latency Connections

I have seen a few posts on DBA.SE (where I answer a lot of questions) recommending the use of semi-synchronous replication in MySQL 5.5 over a WAN as a way to improve the reliability of replication. My gut reaction was that this is a very bad idea with even a tiny write load, but I wanted to test it out to confirm. Please note that I do not mean to disparage the author of those posts, a user whom I have great respect for.

What is semi-synchronous replication?

The short version is that one slave has to acknowledge receipt of the binary log event before the query returns. The slave doesn’t have to execute it before returning control so it’s still an asynchronous commit. …
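The cost of that acknowledgement can be estimated with simple arithmetic. The sketch below is not from the original post: it assumes each commit on a single connection waits for exactly one network round trip to the slave, and it ignores group commit and the fallback to asynchronous mode after rpl_semi_sync_master_timeout. The function name and the sample latencies are hypothetical.

```python
def max_commits_per_second(rtt_ms, local_commit_ms=0.0):
    """Rough upper bound on commits per second for ONE connection.

    Assumption: every commit blocks for one full round trip to the slave
    (rtt_ms) on top of the local commit cost (local_commit_ms).
    """
    per_commit_ms = rtt_ms + local_commit_ms
    if per_commit_ms <= 0:
        raise ValueError("total per-commit time must be positive")
    return 1000.0 / per_commit_ms


if __name__ == "__main__":
    # Hypothetical round-trip times: same rack, same region, cross-continent.
    for rtt in (0.5, 20.0, 100.0):
        bound = max_commits_per_second(rtt)
        print(f"RTT {rtt:6.1f} ms -> at most {bound:8.1f} commits/s per connection")
```

Under these assumptions the per-connection ceiling falls in direct proportion to latency, which is why a WAN link is the worrying case.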

[Read more]
Benchmarking single-row insert performance on Amazon EC2

I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used iiBench, a nice and effective tool developed by Tokutek. Though the “1 billion row insert challenge” for which this tool was originally built is long over, the tool still serves well for benchmarking purposes.

OK, let’s start off with the configuration details.

Configuration

First of all let me describe the EC2 instance type that I used.

EC2 Configuration

I chose the m2.4xlarge instance, as that’s the instance type with the most memory available, and memory is what really matters.

High-Memory Quadruple Extra Large Instance
68.4 GB of memory
26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute …
[Read more]
Disaster recovery node on Ephemeral Cassandra cluster

Ephemeral nodes in EC2 are both good and bad: there is no guaranteed storage of the kind EBS (Elastic Block Store) provides, but you get guaranteed full disk bandwidth, which you can make even better if you RAID0 the disks.

Suppose you built the Cassandra cluster by making every node but one an ephemeral node.
Then you set up ONE node as an EBS-backed node (with unpredictable or relatively bad performance).
Finally, you make that node the seed node for all other nodes, which makes schema management even easier.

On all ephemeral nodes, set up Snitch (in cassandra.yaml) as:

[Read more]

Unindexed queries can be really expensive

The story happened with a webshop application running on Amazon EC2 micro instances, two of them to be exact. Amazon’s business model is basically simple: they charge for only three things, namely CPU time, IOPS and network traffic. Everybody (including me) thinks at first that network traffic will be the bottleneck, until they get the first bill (which can arrive even a year later, considering the free tier). In fact, of the three, IOPS is the most expensive.

Symptoms

On the Cacti graphs I saw strange data. The numbers of temporary tables created on disk and of temporary files created were much higher than the number of temporary tables created. 67% of temporary tables were created on disk. This is very far from optimal.
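A ratio like this can be computed from the server's status counters. The sketch below is an illustration rather than the author's method: it assumes Created_tmp_tables counts every internal temporary table and Created_tmp_disk_tables counts those that ended up on disk, both as reported by SHOW GLOBAL STATUS. The function name and the sample numbers are hypothetical.

```python
def disk_tmp_table_ratio(status):
    """Return the fraction of internal temporary tables created on disk.

    `status` maps status variable names to values, for example the rows of
    SHOW GLOBAL STATUS LIKE 'Created_tmp%' loaded into a dict.
    Values may be strings, as MySQL clients usually return them.
    """
    total = int(status["Created_tmp_tables"])
    on_disk = int(status["Created_tmp_disk_tables"])
    if total == 0:
        return 0.0  # nothing created yet, so nothing went to disk
    return on_disk / total


if __name__ == "__main__":
    # Hypothetical counter values chosen to reproduce a 67% ratio.
    sample = {"Created_tmp_tables": "300", "Created_tmp_disk_tables": "201"}
    print(f"{disk_tmp_table_ratio(sample):.0%} of temporary tables went to disk")
```

Because the counters are cumulative since server start, sampling them twice and comparing the differences gives a better picture of current behaviour than a single reading.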

Temporary objects in MySQL

Quick patch

I increased the max_heap_table_size and tmp_table_size from …

[Read more]
Taming the EC2 API

I've been spending some time lately familiarizing myself with EC2, setting up some MySQL servers & clusters here and there, and doing some really basic configuration testing. One situation you'll run into when interacting with EC2 is that it gets unwieldy to use the AWS Management Console web interface for interacting with your instances. There ends up being lots of scrolling, lots of staring, and lots of sighs. Since I'm using SSH to connect to and interact with my instances, I want a reasonable way to find information about them on the Unix command line.
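As a rough illustration of what command-line friendly output could look like (this is not the tool the post goes on to describe), the sketch below flattens a response shaped like the EC2 DescribeInstances API result into one tab-separated line per instance. The function name and the sample instance data are hypothetical; fetching the response, for example through an AWS SDK, is left out so the example stays self-contained.

```python
def instance_rows(response):
    """Flatten a DescribeInstances-style response into tab-separated rows.

    Each row holds: instance id, state, instance type, public DNS name.
    Missing or empty fields are shown as "-".
    """
    rows = []
    for reservation in response.get("Reservations", []):
        for inst in reservation.get("Instances", []):
            rows.append("\t".join([
                inst.get("InstanceId", "-"),
                inst.get("State", {}).get("Name", "-"),
                inst.get("InstanceType", "-"),
                inst.get("PublicDnsName") or "-",
            ]))
    return rows


if __name__ == "__main__":
    # Hypothetical sample data in the DescribeInstances response shape.
    sample = {
        "Reservations": [
            {"Instances": [
                {"InstanceId": "i-0example1", "State": {"Name": "running"},
                 "InstanceType": "m1.small",
                 "PublicDnsName": "ec2-host.example.com"},
                {"InstanceId": "i-0example2", "State": {"Name": "stopped"},
                 "InstanceType": "m1.large", "PublicDnsName": ""},
            ]}
        ]
    }
    for row in instance_rows(sample):
        print(row)
```

One line per instance keeps the output easy to pipe into grep, cut or awk, which is the point of moving away from the web console.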

Amazon has an official set of tools [http://aws.amazon.com/developertools/351] that give you this information, at least theoretically. It is some gigantic distribution of shell scripts and Java madness that, if you are very patient, will …

[Read more]
TPC-C like Benchmarks of Galera and Stock MySQL Replication

Vadim Tkachenko of Percona benchmarks Galera versus standalone Percona Server and stock MySQL replication using tpcc-mysql.
