Amazon's EC2 and its sister service S3 have been the indisputable leaders in IaaS for a long while now, and GlassFish, and more generally J2EE/Java EE, took advantage of them starting in 2008 (see here and here), with documented how-tos and significant production references. …
- Resolving the contradictions between web services, clouds, and open source
- Defining clouds, web services, and other remote computing
- Why clouds and web services will continue to take over computing
- Why web services should be released as free software (12/20)
- Reaching the pinnacle: truly open web services and clouds (12/22) …
I guess they got tired of people sending angry emails about data transfer fees:
“Amazon provides an online calculator to help customers decide whether it makes financial sense to ship data via mail rather than uploading over the Internet. You plug in the number of terabytes, devices, average file size, return shipping information and other factors, and find out how much the data transfer would cost via mail compared to standard Internet uploads.
For example, transferring data from a single device containing 2TB would require 26 hours of data loading time and cost $144.74. Uploading the same amount of data over the Internet would cost $204.80. The calculator does not show how long the Internet transfer would take.”
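The quoted figures are easy to reproduce; a back-of-the-envelope sketch of the same comparison, assuming the then-current rates of $0.10/GB for inbound internet transfer and $80 per device plus $2.49 per data-loading-hour for the mail option:
# 2TB example from the article; rates are assumptions, not from the excerpt
TB=2; HOURS=26
echo "internet upload: $(echo "$TB * 1024 * 0.10" | bc) USD"   # -> 204.80
echo "ship via mail:   $(echo "80 + $HOURS * 2.49" | bc) USD"  # -> 144.74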
Piper Jaffray has published a 300+ page study of the cloud computing industry based on a recent survey of 100 CIOs. Bottom line: cloud computing is expected to grow significantly over the next five years.
Survey respondents expect the mix of cloud computing to escalate strongly to 13.5% in five years. This equates to a five-year CAGR of 19.2%, or 23.9% when we also incorporate IDC’s forecast that total software budgets will grow 4.7% annually. In other words, software spending will grow gradually in the next five years, but the mix of spend allocated to cloud-based applications will likely surge rapidly. Another way to think about the data is that the Cloud Computing market is expected to grow five times as fast as the broader software market: 23.9% vs. 4.7%.
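As a quick sanity check on those numbers (a sketch; the starting mix is implied by the quote, not stated in it): growing at 19.2% per year for five years multiplies the mix by 1.192^5 ≈ 2.4, so a 13.5% ending mix implies a current cloud mix of roughly 5.6% of software spend.
# work back from the survey's own figures to the implied current mix
awk 'BEGIN { printf "implied current mix: %.1f%%\n", 13.5 / (1.192 ^ 5) }'   # -> 5.6%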
If anything, I think the prediction is conservative and the impact could be much larger in magnitude when mainstream …
Just got the following in my email this morning. I sure wish they had done this earlier. “Free Inbound Data Transfer (until June 30, 2010) Data Transfer into AWS will be free of charge from now through June 30, 2010, making it even easier for customers to get their data into AWS. This applies to [...]
So during preparation of the XtraDB template for EC2 I wanted to understand what IO characteristics we can expect from an EBS volume (I am speaking about a single volume, not RAID as in my previous post). Yasufumi did some benchmarks and pointed me to interesting behavior: there seem to be several levels of caching on an EBS volume.
Let me show you. I ran a sysbench random read IO benchmark on files ranging in size from 256MB to 5GB in 256MB steps. And, as Morgan pointed out, I first wrote to the whole volume to avoid the first-write penalty:
dd if=/dev/zero of=/dev/sdk bs=1M   # pre-write every block so later reads don't pay the first-write penalty
For reference, the script is:
set -u
set -x
…
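The script is cut off above; as a sketch of what the benchmark loop might look like with sysbench's fileio test (0.4-era flags; file sizes and random-read mode taken from the description, the rest assumed):
for size in $(seq 256 256 5120); do
    sysbench --test=fileio --file-total-size=${size}M prepare                         # create the test files
    sysbench --test=fileio --file-total-size=${size}M --file-test-mode=rndrd run      # random reads
    sysbench --test=fileio --file-total-size=${size}M cleanup                         # remove the test files
done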
During preparation of the Percona-XtraDB template to run in the RightScale environment, I noticed that IO performance on an EBS volume in the EC2 cloud is not quite perfect. So I have spent some time benchmarking volumes. An interesting aspect of EBS volumes is that each one appears as a device in your OS, so you can easily make a software RAID from several volumes.
So I created 4 volumes (I used an m1.large instance) and made:
RAID0 on 2 volumes as:
mdadm -C /dev/md0 --chunk=256 -n 2 -l 0 /dev/sdj /dev/sdk
RAID0 on 4 volumes as:
mdadm -C /dev/md0 --chunk=256 -n 4 -l 0 /dev/sdj /dev/sdk /dev/sdl /dev/sdm
RAID5 on 3 volumes as:
mdadm -C /dev/md0 --chunk=256 -n 3 -l 5 /dev/sdj /dev/sdk /dev/sdl
RAID10 on 4 volumes in two steps, first the mirror pairs and then a stripe over them (the commands are cut off in the excerpt; the completion assumes the device naming continues the sdj…sdm pattern):
mdadm -v --create /dev/md0 --chunk=256 --level=raid1 --raid-devices=2 /dev/sdj /dev/sdk
mdadm -v --create /dev/md1 --chunk=256 --level=raid1 --raid-devices=2 /dev/sdl /dev/sdm
mdadm -v --create /dev/md2 --chunk=256 --level=raid0 --raid-devices=2 /dev/md0 /dev/md1
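A natural follow-up, not shown in the excerpt, is to confirm the array assembled and put a filesystem on it; a minimal sketch (filesystem and mount point are my choices, purely illustrative):
cat /proc/mdstat                                 # confirm the md device is up with the expected member volumes
mkfs.xfs /dev/md0                                # any filesystem will do; xfs is a common choice for MySQL data
mkdir -p /mnt/data && mount /dev/md0 /mnt/data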
The question "what problems will I have when migrating to the cloud" gets asked often enough. If by cloud you mean Amazon EC2, then from a technical perspective there isn't much that changes. The biggest thing that changes is just how you pay your bill.
Having said that, there are still a few potential gotchas:
- There are no virtual IP addresses. Most high-availability tools (like MMM or DRBD+Heartbeat) work on the principle of having a floating IP address which the application uses to connect to the current master. With EC2, you can't do this (see the sketch after this list).
- There's no customization of memory. The maximum amount you can have is 15GB, so some users with larger working sets may find this a limitation. If you look at the Dell online store, it costs $2,094 to upgrade an R900 from 4GB of memory to 64GB (or $4,378 to upgrade to 128GB), which …
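For concreteness, this is the kind of address takeover those HA tools perform on an ordinary network, and which EC2's network will not honor (addresses illustrative):
ip addr add 192.168.0.100/32 dev eth0   # claim the floating IP on the newly promoted master
arping -c 3 -A -I eth0 192.168.0.100    # gratuitous ARP so neighbors update their caches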
EC2 is nifty, but it doesn’t appear suitable for all needs, and that’s what this post is about.
For instance, a machine can just “disappear”. You can set things up to automatically start a new instance to replace it, but a transaction you just committed is likely to be lost: MySQL replication is asynchronous, EBS is slower if you commit your transactions on it, and EBS snapshots are only periodic (you’d have to add foo on the application end). This adds complexity, and so the question arises whether EC2 is the best solution for systems where this is a concern.
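To make the trade-off concrete, durable commits in MySQL mean an fsync on every commit for both the redo log and the binlog, and on EBS each of those fsyncs pays the volume's write latency (a sketch, not from the original post; the path is illustrative):
# durable commits on EBS: every commit fsyncs the InnoDB redo log and the binlog
mysqld --datadir=/ebs/mysql --innodb_flush_log_at_trx_commit=1 --sync_binlog=1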
When pondering this, there are two important factors to consider: a database server needs cores, RAM, and reasonably low-latency disk access; and application servers should be near their database server. This means you shouldn't split app and db servers across different hosting/cloud providers.
We’d like to hear your thoughts on EC2 …
Yahoo opens up Hadoop distribution. Microsoft and Novell claim customer wins. And more.
Follow 451 CAOS Links live @caostheory
The elephant in the room
Plenty of news emerged from the Hadoop Summit this week: Cloudera announced support for Amazon Elastic Block Store (EBS) and introduced Sqoop, an open source tool for importing databases into Hadoop, while Yahoo! Released! The! Yahoo! Distribution! Of! Hadoop! opening up its Hadoop developments to the wider community. As Savio Rodrigues noted, there has been a surge in the number of contributors to the Hadoop project in the last year.
Best of the rest