Please make it descriptive, graphic, and if anything burnt or exploded I'd love to have pictures.
Include an approximate timeline of when things happened and when it was all working again (if ever).
This somewhat relates to the earlier post "A SAN is a single point-of-failure, too". Somehow people end up with highly virtualised environments where the SANs are set up with replication and so on, but it all still runs on the same hardware and SAN backend. So if this admittedly very nice hardware fails (and it will!), the degree of "we're stuffed" is particularly high. How heavily the business processes rely on that setup is possibly the key factor there, rather than purely technical issues.
Anyway, if you have good stories of (distributed?) SAN and VM infra failure, please step up and tell all. It'll help prevent ...