Tumblr has been down for more than 12 hours due to an issue with their database cluster. Here is the comment I left on GigaOm.com:
This is the freshest lesson for entrepreneurs and startups:
- Learn to value your data
- Implement a high availability plan
- Plan a disaster recovery strategy
“Tumblr likely has the resources to recover…”
I really hope that holds true, but remember: data is the only irreplaceable asset of an organization. Once it’s gone, it’s gone.
When I was handling the disaster at Fotolog (massive database corruption when our SAN crashed), I couldn’t find any company or consulting firm ready to handle the situation and help with data recovery. It was a miracle that I came [Read more...]
“Funny how Amazon doesn't use S3 to store any assets for amazon.com” (tweet by @gruber)
Amazon's S3 suffered a major outage today, knocking many websites offline. The outage started at approximately 12:00 PM EST, and the last time I checked, at 11:11 PM EST, Smugmug, a popular photo hosting site that uses S3 extensively, was still down.
Disaster is really inevitable. Even with all the redundant power investments, ThePlanet (formerly EV1 and RackShack) had to shut down their backup generators at their H1 data center on the instructions of the fire crew. This happened after a wire short in a faulty transformer led to an explosion that knocked down one of their walls, ultimately bringing 9,000 servers down. Luckily no one was injured.
This just goes to show that redundant power and backup generators do not make a data center immune to disaster. IIRC, ThePlanet's last disaster was blamed on backup generators not kicking in properly.
While there was no damage to servers, I wonder how many MyISAM repairs need to be triggered once the servers do come back online?