The outage that occurred on August 30 led to downtime in Azure services pertaining to APIs, databases, and applications.

Microsoft has blamed insufficient staffing and failed automation for a data center outage in Australia on August 30 that prevented users from accessing Azure, Microsoft 365, and Power Platform services for over 24 hours.

In a post-incident analysis report, Microsoft said the outage was caused by a utility power sag in the Australia East region, which in turn “tripped a subset of the cooling units offline in one data center, within one of the Availability Zones.”

With the cooling units offline, rising temperatures forced an automated shutdown of the data center to preserve data and infrastructure health, affecting compute, network, and storage services.

However, Microsoft said the cooling units could have been restarted manually, but there were not enough personnel at the data center to do so in time.

“Due to the size of the data center campus, the staffing of the team at night was insufficient to restart the chillers in a timely manner. We have temporarily increased the team size from three to seven, until the underlying issues are better understood and appropriate mitigations can be put in place,” Microsoft wrote in the report.

In addition, the company said it is working on other major changes, such as improving the data center's existing automation so that services are restored faster after an incident.

“We are exploring ways to improve existing automation to be more resilient to various voltage sag event types,” Microsoft said, adding that an evaluation was underway to ensure that the highest-load servers and their corresponding chillers are restarted first.

Microsoft has reported several outages in the past few months, many of them affecting Microsoft 365 services. In July, an outage took out its OneDrive for Business and SharePoint Online services.
In June, users faced issues with Outlook Web, Teams, OneDrive for Business, and SharePoint for over eight hours. In May, the company reported that UK users were having trouble accessing some Microsoft 365 services. In April, Microsoft said it was investigating an issue in which certain users were unable to use the search functionality in multiple Microsoft 365 services. Outlook on the Web, Exchange Online, SharePoint Online, Microsoft Teams, and Outlook desktop clients were among the affected services. In another incident in April, users could not access Microsoft 365 web applications and Teams. Microsoft also suffered a global outage in February, when users once again could not access email and Teams. It suffered a similar outage in January.