Microsoft’s Azure cloud services were disrupted for several hours on Friday across Japan due to a cooling issue at a data center managed by a third party.
Azure components affected included virtual machines and any services that depend on Azure Storage, including API Management, Backup, Site Recovery, IoT Hub, SQL Database, Azure Machine Learning and Azure Notification Hub.
On the Microsoft Azure status page, the company set out the suspected cause of the outage, based on an initial investigation.
It said: “One RUPS system (a rotary uninterruptible power supply system) failed in a manner which resulted in the power distribution that feeds all of the air handler units (AHU) in Japan East data center to fail. The result of the air handlers failing was continued increase in temperature within the entire data center.”
With cooling having failed, “some resources were automatically shutdown to avoid overheating and ensure data integrity and resilience.”
Third-party vendor teams and Microsoft’s site services personnel restarted the cooling system's air handlers, “using outside airflow to force cool the data center.”
The cooling system was designed for N+1 redundancy, and the power distribution design was running at N+2, the company said.
The RUPS unit has been sent off for analysis, Microsoft added, apologizing for any downtime or issues customers may have experienced.
Earlier that week, Japan West users ran into problems using virtual machines because a newly built storage scale unit in one data center had been assigned IP addresses that should have been used only in another facility.
Microsoft originally launched its cloud in Japan in 2014 with two data centers: Japan East, in Saitama Prefecture near Tokyo, and Japan West, in Osaka Prefecture.