Some users were unable to log into Microsoft's Office 365, Teams, Outlook and other offerings for three hours late on Monday US time, due to a botched upgrade.
Between 2125 UTC and 0023 UTC, Microsoft's Azure Active Directory (Azure AD) suffered an outage that affected authentication services and prevented logins. Microsoft said the problem was caused by a recent "configuration change". It rolled back that change, and reported that most users should have been able to use the services as normal by around 0230 UTC.
Despite reports of a global outage, Microsoft's status reports said the outage affected "a subset of customers in the Azure Public and Azure Government clouds." Those already logged in could apparently continue working.
Microsoft's preliminary analysis pointed to a recent configuration change that affected a backend storage layer, adding "latency" to authentication requests. In response, the company rolled back the change, though some users continued to feel residual effects for some hours.
Worryingly, the US also suffered an interruption to 911 emergency call services at the same time. Fourteen states were affected, and a tweet from the City of Redmond led some to blame the 911 outage on Microsoft's cloud problems. ZDNet reports that the emergency call failures were more likely caused by a problem with PSAPs (Public Safety Answering Points) driven by voice over IP (VoIP).
After early reports, all seems well. Security professional Mike Fowler pointed out in a tweet that in the cloud, the likelihood of problems has decreased, but their impact has gone up because they affect multiple customers. Fowler summed up: "Sh*t happened, MS got on it, fixed, job done."
Microsoft has promised a full Post Incident Report (PIR) will be published within the next 72 hours.