Our electrical and electronic systems are complex and delicate. Complexity can confer massive redundancy and resistance to failure; contrariwise, it can confer single points of failure and fault propagation. In this case, a single bolt of lightning, to a power utility transformer, disrupted power to the Irish servers of Amazon’s Elastic Computing Cloud (EC2). The power surge affected the electricity phase control system that must be running before alternative generators are brought online. With the generators de-synchronized, power remained out. This shows that the parts of Amazon’s data centre are not isolated enough to stand a high-voltage surge.
In the end, it didn’t matter that the servers ran multiple virtual machines. All instances of EC2 were knocked offline for three hours, with gradual recovery after that, extending up to two days more. I hope not too many businesses are depending on it!