|Showing entries 1 to 30 of 61||Next 30 Older Entries|
If you use Amazon Elastic Compute Cloud (EC2), you are offered a default choice of AMIs for your base OS (plenty of other AMIs are available): Amazon Linux AMI, Red Hat Enterprise Linux, SUSE Linux Enterprise Server and Ubuntu. In terms of cost, the Amazon Linux AMI is the cheapest, followed by SUSE, then RHEL.
I use EC2 a lot for testing, and recently had to pay a “RHEL tax” as I needed to run a RHEL environment. For most uses I’m sure you can be satisfied by the Amazon Linux AMI. The last numbers suggest Amazon Linux is #2 in terms of usage on EC2.
Anyway, recently Amazon Linux AMI came out with the 2014.03 release (see release[Read more...]
The Severalnines team is pleased to announce the release of ClusterControl 1.2.4. This release contains key new features along with performance improvements and bug fixes.
We have outlined some of the key features below. For additional details about the release:[Read more...]
A little while ago I blogged about (and open sourced) an Impala-powered soccer visualization demo, designed to demonstrate just how responsive Impala queries can be. Since not everyone has the time or resources to run the project themselves, we’ve decided to host it ourselves on an EC2 instance. You can try the visualization; we’ve also opened up the Impala web interface, where you can see query profiles and performance numbers, and Hue (username and password are both ‘test’), where you can run your own queries on the dataset.
We are doing a migration from Amazon RDS to EC2 with a customer. This, unfortunately, involves some downtime – if you are an RDS user, you probably know you can’t replicate an RDS instance to an external server (or even EC2). While it is annoying, this post isn’t going to be a rant on how RDS can make you feel locked in. Instead, I wanted to give you a quick tip.
So here’s the thing: you can’t stop replication on an RDS read replica, because you don’t have (and won’t get) privileges to do that:
replica> STOP SLAVE;
ERROR 1045 (28000): Access denied for user 'usr'@'%' (using password: YES)
Normally, you don’t want to do that. However, we wanted to run some pt-upgrade checks before we migrate[Read more...]
Read the original article at Cloud Operations Interview
What does a cloud computing expert need to know? How do you hire a cloud computing expert? Competition for operations & DBAs is fierce, so you’ll want to know how to find the best.
If you’re a systems administrator or ops guy, you may want to prepare for an interview for such a position. Meanwhile, if you’re a director of IT or operations, a recruiter or manager in HR, you’ll want to have some idea how to find the right candidate.
Here’s my guide to do just that. You may[Read more...]
Read the original article at AirBNB didn’t have to fail
Today part of Amazon Web Services failed, taking down with it a slew of startups that all run on Amazon’s Cloud infrastructure. AirBNB was one of the biggest, but Heroku, Reddit, Minecraft, Flipboard & Coursera also went down with it. It’s not the first time. What the heck happened, and why should we care?
The AWS service allows companies like AirBNB to build web applications, and host them on servers owned and managed by Amazon.[Read more...]
I have seen a few posts on DBA.SE (where I answer a lot of questions) recommending the use of semi-synchronous replication in MySQL 5.5 over a WAN as a way to improve the reliability of replication. My gut reaction was that this is a very bad idea with even a tiny write load, but I wanted to test it out to confirm. Please note that I do not mean to disparage the author of those posts, a user whom I have great respect for.
What is semi-synchronous replication?
The short version is that one slave has to acknowledge receipt of the binary log event before the query returns. The slave doesn’t have[Read more...]
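To get a feel for why a WAN makes this expensive, here is a rough back-of-the-envelope sketch. It assumes that every commit on a single connection waits for at least one network round trip to the acknowledging slave. The round-trip times are hypothetical examples, not measurements from this test.

```python
def max_serial_commits_per_second(rtt_ms: float) -> float:
    """Upper bound on commits/sec for ONE connection committing serially,
    assuming every commit waits at least one round trip of rtt_ms."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return 1000.0 / rtt_ms


# Hypothetical round-trip times, for illustration only.
EXAMPLE_RTTS_MS = {"same datacenter": 0.5, "cross-country WAN": 80.0}

for label, rtt in EXAMPLE_RTTS_MS.items():
    bound = max_serial_commits_per_second(rtt)
    print(f"{label}: at most {bound:.1f} commits/sec per connection")
```

This is only a ceiling for one serial writer; concurrent connections can overlap their waits, so total throughput depends on how many clients commit in parallel.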
I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used iiBench, a nice and effective tool developed by Tokutek. Although the “1 billion row insert challenge” for which this tool was originally built is long over, the tool still serves well for benchmarking purposes.
OK, let’s start off with the configuration details.
First of all let me describe the EC2 instance type that I used.
I chose the m2.4xlarge instance, as that’s the instance type with the most memory available, and memory is what really matters.
High-Memory Quadruple Extra Large[Read more...]
Ephemeral Nodes in EC2 are good and bad... No guaranteed storage (like EBS, Elastic Block Storage), but you get guaranteed full disk bandwidth, which you can make even better if you RAID0 the disks.
Suppose you built the Cassandra cluster by making every node, but one, an ephemeral node...
And then you set up ONE node as an EBS-backed node (with unpredictable or relatively bad performance).
Then you set up that node to be the seed node for all other nodes, which makes schema management even easier.
On all ephemeral nodes, set up Snitch (in cassandra.yaml) as:
I've been spending some time lately familiarizing myself with EC2, setting up some MySQL servers & clusters here and there, and doing some really basic configuration testing. One situation you'll run into when interacting with EC2 is that it gets unwieldy to use the AWS Management Console web interface for interacting with your instances. There ends up being lots of scrolling, lots of staring, and lots of sighs. Since I'm using SSH to connect to and interact with my instances, I want a reasonable way to find information about them on the Unix command line.
Amazon has an official set of tools [http://aws.amazon.com/developertools/351] that gives you this information, at least theoretically. It is some gigantic distribution of shell scripts and Java madness that, if you are very[Read more...]
Vadim Tkachenko of Percona benchmarks Galera versus standalone Percona Server and stock MySQL replication using tpcc-mysql.
Review of Thursday’s Cloud Events in Boston
Everyone is well aware by now of the EC2 outage that Amazon had back in April, and it would have surprised no one if that high-profile outage had put a damper on cloud adoption. But judging by what we heard yesterday at Boston’s two cloud events (MassTLC’s Cloud Computing Summit and Vilna’s Moving Your Data to the Cloud Panel), cloud solutions can work just fine. For example, there was the customer story told by Douglas Kim, Managing Director, Global Head, PaaS & Cloud Computing at[Read more...]
Amazon EC2 and cloud computing offer great promise for startups to ramp up their online presence quickly. Navigate those challenges with a strong partner. We bring 20 years of experience to the table with each new client.
When I spoke at Percona Live (video here) on running an E-commerce database in Amazon EC2, I briefly talked about using RAID 10 for additional performance and fault tolerance when using EBS volumes. At first, this seems counterintuitive. Amazon has a robust infrastructure, EBS volumes run on RAIDed hardware, and are mirrored in multiple availability zones. So, why bother? Today, I was reminded of just how important it is. Please note that all my performance statistics are based on direct experience running a MySQL database on an m2.4xlarge instance and not on some random bonnie or orion benchmark. I have those graphs floating around on my hard drive in glorious 3D and, while interesting, they do not necessarily reflect real-life[Read more...]
NDB cluster is a very interesting solution in terms of high availability, since there is no single point of failure. In an environment like EC2, where a node can disappear almost without notice, one would think that it is a good fit.
It is indeed a good fit, but reality is a bit trickier. The main issue we faced is that IPs are dynamic in EC2, so when an instance restarts, it gets a new IP. What’s the problem with a new IP? Just change the IP in the cluster config and perform a rolling restart, no? In fact this will not work: since the cluster is already in degraded mode, restarting the surviving node of the degraded node group (NoOfReplicas=2) will cause the NDB cluster to shut down.
This can be solved by using host names instead of IPs in the config.ini file. What needs to be done is to define, in /etc/hosts, one entry per cluster member.[Read more...]
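As a minimal sketch of that idea, the snippet below renders one /etc/hosts line per cluster member and config.ini data node sections that refer to host names only. The host names and private IPs are made up for illustration, and the helper functions are hypothetical, not part of any NDB tooling.

```python
# Hypothetical cluster members: names and private IPs are assumptions.
CLUSTER_MEMBERS = {
    "ndb-mgm1": "10.0.0.10",
    "ndb-data1": "10.0.0.11",
    "ndb-data2": "10.0.0.12",
}


def hosts_lines(members: dict[str, str]) -> list[str]:
    """Return one /etc/hosts line ("IP<TAB>name") per cluster member."""
    return [f"{ip}\t{name}" for name, ip in members.items()]


def data_node_sections(names: list[str]) -> str:
    """Return config.ini [ndbd] sections that use host names, not IPs."""
    return "\n".join(f"[ndbd]\nHostName={name}\n" for name in names)


print("\n".join(hosts_lines(CLUSTER_MEMBERS)))
print()
print(data_node_sections(["ndb-data1", "ndb-data2"]))
```

With this layout, when an instance comes back with a new IP, only its /etc/hosts line needs to change on each member; config.ini itself stays untouched.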
My list of reasons for never using or recommending Amazon’s MySQL RDS service grows every time I experience problems with customers. This was an interesting and still unresolved issue.
ERROR 126 (HY000): Incorrect key file for table '/rdsdbdata/tmp/#sql_5b7_1.MYI'; try to repair it
You may notice this is a MyISAM table. The MySQL database is version 5.5, uses only InnoDB tables, and is very small: 100MB in total size.
What is happening is that MySQL is generating a temporary table, and this table is being written to disk. I am unable to change the code to improve the query causing this disk I/O.
What I cannot understand, and have no ability to diagnose, is why this error occurs only sometimes, generally when the database is under additional system load. With RDS you have no visibility of the server running the production database. While you have SQL[Read more...]
There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in each image is split into 20 “shards”. This blog post will refer to an EC2 instance as a node from here on out. Shard-Query is very flexible in its configuration, so you can use this sample database to spread processing over up to 20 nodes.
The Infobright Community Edition (ICE) images are available in 32 and 64 bit varieties. Due to memory requirements, the InnoDB versions are only available on 64 bit instances. MySQL will fail to start on a micro instance; simply decrease the values in the /etc/my.cnf file if you really want to try micro instances.[Read more...]
For the last two days the press has been reporting the outage of Amazon Web Services Elastic Compute Cloud (EC2) in just one North Virginia data center. This has affected many large websites, including FourSquare, Hootsuite, Reddit and Quora. A detailed list can be found at ec2disabled.com.
For these popular websites, was this avoidable? Absolutely.
Basic scalability principles, if deployed in these systems’ architecture, would have averted the significant downtime regardless of the development stack. While I work primarily in MySQL, these principles are not new, nor are they complicated; they are fundamental concepts in scalability that apply to any technology[Read more...]
It's been long known that Galera optimistic replication and enterprise-size databases are a match made in heaven. Today we're going to get a little closer to testing this statement.
We'll have a look at how Galera can scale out a Sysbench OLTP complex workload with 60 million rows in EC2. This is the first proper benchmark for the 0.8 series and also the first benchmark of the MariaDB/Galera port, so I'll start modest, just to see how it goes. I chose m1.large instances with 7.8Gb of RAM for server nodes and a c1.xlarge instance for the client - I don't want the client to be a bottleneck.
For comparison I have also measured performance of a stock standalone MariaDB 5.1.55 server. I used the standard my.cnf that comes with MariaDB Debian package with the following alterations:
Amazon's EC2 and its sister S3 service have been indisputable leaders in IaaS for a long while now and GlassFish and more generally J2EE/JavaEE took advantage of it starting in 2008 (see here and here), with documented how-to's and significant production references.
Just yesterday, AWS's Evangelist Jeff Barr announced[Read more...]
I guess they got tired of people sending angry emails about data transfer fees:
“Amazon provides an online calculator to help customers decide whether it makes financial sense to ship data via mail rather than uploading over the Internet. You plug in the number of terabytes, devices, average file size, return shipping information and other factors, and find out how much the data transfer would cost via mail compared to standard Internet uploads.
For example, transferring data from a single device containing 2TB would require 26 hours of data loading time and cost $144.74. Uploading the same amount of data over the Internet would cost $204.80. The calculator does not show how long the Internet transfer would take.”[Read more...]
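The quoted figures can be unpacked with a little arithmetic. The sketch below assumes binary units (1 TB = 1024 GB, 1 GB = 1024 MB); the per-GB price and loading speed it prints are derived from the quote under that assumption, not stated in it.

```python
# Figures taken from the quoted example.
DEVICE_TB = 2
LOAD_HOURS = 26
INTERNET_COST_USD = 204.80

# Assumption: binary units (1 TB = 1024 GB, 1 GB = 1024 MB).
device_gb = DEVICE_TB * 1024

# Price per GB implied by the Internet upload cost.
implied_rate_per_gb = INTERNET_COST_USD / device_gb

# Sustained speed implied by loading the whole device in LOAD_HOURS.
implied_load_mb_per_s = device_gb * 1024 / (LOAD_HOURS * 3600)

print(f"Implied upload price: ${implied_rate_per_gb:.2f} per GB")
print(f"Implied device loading speed: {implied_load_mb_per_s:.1f} MB/s")
```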
Piper Jaffray has published a 300+ page study on the cloud computing industry based on a recent survey undertaken of 100 CIOs. Bottom line, cloud computing is expected to grow significantly over the next five years.
Survey respondents expect the mix of cloud computing to escalate strongly to 13.5% in five years. This equates to a five-year CAGR of 19.2%, or 23.9% when we also incorporate IDC’s forecast that total software budgets will grow 4.7% annually. In other words, software spending will grow gradually in the next five years, but the mix of spend allocated to cloud-based applications will likely surge rapidly. Another way to think about the data is that the Cloud Computing market is expected to grow five times as fast as the broader software market: 23.9% vs. 4.7%.
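The 23.9% figure looks like the simple sum of the two growth rates; that is my reading of the numbers, not something the study states. A quick sketch comparing that sum with the compounded combination of the same two rates:

```python
MIX_CAGR = 0.192       # growth in the cloud share of software budgets (quoted)
BUDGET_GROWTH = 0.047  # IDC forecast for total software budgets (quoted)

# Simple sum of the two rates, which matches the quoted 23.9%.
additive = MIX_CAGR + BUDGET_GROWTH

# Compounding both effects: spend = share * budget, so growth factors multiply.
compounded = (1 + MIX_CAGR) * (1 + BUDGET_GROWTH) - 1

# "Five times as fast" compares the summed rate with budget growth.
ratio = additive / BUDGET_GROWTH

print(f"additive: {additive:.1%}")
print(f"compounded: {compounded:.1%}")
print(f"ratio vs. software market: {ratio:.1f}x")
```

The compounded rate comes out slightly higher than the sum, so the additive shortcut, if that is what was used, understates the growth a little.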
If anything, I think the prediction is conservative and the impact could[Read more...]
So during preparation of the XtraDB template for EC2 I wanted to understand what IO characteristics we can expect from an EBS volume (I am speaking about a single volume, not RAID as in my previous post). Yasufumi did some benchmarks and pointed me to an interesting behavior: there seem to be several levels of caching on an EBS volume.
Let me show you. I ran a sysbench random read IO benchmark on files ranging in size from 256M to 5GB in steps of 256M. And, as Morgan pointed out to me, I first wrote to the whole volume to avoid the first-write penalty:
dd if=/dev/zero of=/dev/sdk bs=1M
For reference, the script is: