Showing entries 11 to 12
« 10 Newer Entries
Displaying posts with tag: disaster (reset)
Can I have your horror-stories, please? (SANs and VMs)

Please make it descriptive, graphic, and if anything burnt or exploded I'd love to have pictures.
Include an approximate timeline of when things happened and when it was all working again (if ever).
Thanks!

This somewhat relates to the earlier post A SAN is a single point-of-failure, too. Somehow people get into scenarios where highly virtualised environments with SANs get things like replication and everything, but it all runs on the same hardware and SAN backend. So if this admittedly very nice hardware fails (and it will!), the degree of "we're stuffed" is particularly high. The reliance in terms of business processes is possibly a key factor there, rather than purely technical issues.

Anyway, if you have good stories of (distributed?) SAN and VM infra failure, please step up and tell all. It'll help prevent similar issues for …

[Read more]
MySQL Conference Liveblogging: Disaster Is Inevitable - Are You Prepared? (Tuesday 4:25PM)
  • Suicide
    • having no backups
    • depending on slaves for backup
    • keeping backups on same SAN
    • having a single DBA - Frank didn't like this one at all
    • not keeping binlogs
  • Restoring from backup
    • how much time?
    • uncompressed backup ready to mount?
    • separate network for recovery?
  • In Fotolog, 1TB of data was severely hit.
    • first problem: backup was highly compressed (tar.gz)
    • uncompressing took hours
    • so keep uncompressed backups (at least last N days)
    • it should be mountable, rather than transferable
  • Frank going over recovery modes at …
[Read more]
Showing entries 11 to 12
« 10 Newer Entries