When I woke up early this morning, I read the news about the Azure South Central US datacenter outage caused by cooling problems.
On LinkedIn I saw multiple posts from Azure experts saying that DR should have been implemented for the affected applications and databases. There were also multiple emails from customers in America and Europe asking for Disaster Recovery solutions for their applications hosted in Azure.
After going through this deluge of LinkedIn posts and emails, I finally reread the articles on the outage in ZDNet and GeekWire. I also got some information from my insider network. The cooling problem affected only one of the datacenters. The Azure South Central US region has multiple physically separate datacenters that do not share infrastructure such as networking, grid power supply and cooling.
A cooling problem is a localized failure, so it calls for a High Availability (HA) solution and not a DR solution. The best solution would have been Availability Zones (AZs): customers who deploy their application across different AZs and load balance between them can survive the loss of a single datacenter. Unfortunately, the South Central US region does not support Availability Zones, so this option was not available there.
The figure below shows a load-balanced application deployed across three Availability Zones in a region. These AZs are separate Azure datacenters.
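To make the idea concrete, here is a minimal sketch of how a load balancer in front of three zones keeps an application up when one zone fails. This is a toy simulation and not Azure's load balancer: the `Backend` and `ZoneAwareBalancer` classes, the zone names and the round-robin policy are all my own assumptions for illustration.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Backend:
    """One copy of the application running in a given zone (hypothetical model)."""
    zone: str
    healthy: bool = True


class ZoneAwareBalancer:
    """Round-robin over backends, skipping any backend marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._ring = cycle(self.backends)

    def route(self):
        # Look at each backend at most once per request.
        for _ in range(len(self.backends)):
            backend = next(self._ring)
            if backend.healthy:
                return backend.zone
        # Every zone in the region is down: HA cannot help, this is the DR case.
        raise RuntimeError("no healthy zone left in this region")


backends = [Backend("zone-1"), Backend("zone-2"), Backend("zone-3")]
balancer = ZoneAwareBalancer(backends)

before = [balancer.route() for _ in range(3)]
backends[0].healthy = False  # simulate a cooling failure in one datacenter
after = [balancer.route() for _ in range(4)]

print(before)
print(after)
```

With one zone marked unhealthy, requests keep flowing to the other two zones. Only when every zone in the region is unhealthy does `route()` fail, and that is the situation a DR plan in another region is meant for.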
Since this was a datacenter outage, an HA solution should have been implemented and not a DR solution. HA should be used to mitigate an outage at a particular datacenter. DR should be used when an entire region goes down.