The recent CrowdStrike and Microsoft global outages show just how fragile internet services and communications systems are, and illustrate how one seemingly small error can cascade across sectors and economies.
Events like these also highlight the telecom sector’s vulnerabilities. With harsh weather fueled by climate change four times more likely to occur in the UK today than in the 1970s, this is another serious area of risk that the sector needs to consider.
Mobile communications act as a crucial early warning system – essential if an effective emergency response is to be organized – yet the National Audit Office (NAO) has warned that the UK is not adequately prepared to deal with climate disasters.
The NAO’s report highlights that a system is only as strong as its weakest link. Without adequate planning and preparation from mobile network operators (MNOs), the impact of calamitous events and outages could be much worse.
Network resilience strategies, or efforts to maintain continuous connectivity and minimize outages, are essential.
The state of UK resilience
In late 2023, UK telecoms regulator Ofcom proposed updated guidance to shore up the resilience of telecoms networks, stating that communications providers have a “legal obligation to identify, prepare for and reduce the risk of anything that compromises the availability, performance or functionality of their network or service.”
The full updated guidance is due later this year, and is expected to include new advice that encourages MNOs to avoid a single point of failure in their networks, and to have automatic failover functionality in place to divert traffic to another site when equipment or infrastructure fails.
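The failover behavior described above can be illustrated with a minimal sketch. It assumes each site reports a simple healthy/unhealthy flag and has a fixed priority. The `Site` class, the `select_site` function, and the site names are hypothetical and are not taken from Ofcom's guidance or any operator's systems.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    name: str
    healthy: bool   # result of the latest health check (assumed to exist)
    priority: int   # lower value = preferred site


def select_site(sites: list[Site]) -> Optional[Site]:
    """Return the preferred healthy site, or None if every site is down."""
    candidates = [s for s in sites if s.healthy]
    return min(candidates, key=lambda s: s.priority, default=None)


# Hypothetical pair of sites: traffic normally goes to the primary.
sites = [
    Site("primary-exchange", healthy=True, priority=0),
    Site("backup-exchange", healthy=True, priority=1),
]
print(select_site(sites).name)  # primary-exchange

# Simulate an equipment failure at the primary site.
sites[0].healthy = False
print(select_site(sites).name)  # backup-exchange
```

With only one site in the list, a single failure leaves `select_site` returning `None`, which is the single point of failure the guidance is expected to discourage.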
This need to avoid a single point of failure is particularly relevant in the wake of the recent CrowdStrike and Microsoft outages, which are sure to be in the minds of Ofcom regulators.
According to Downdetector data, what was supposed to be a routine software update by security firm CrowdStrike ended up being one of the largest outages in IT history, with outage reports surging across a variety of sectors. Microsoft alone saw 42 times as many outage reports on the site as its typical daily average.
Ofcom’s new guidance signals to operators that the resilience of UK telecoms networks is not currently up to a sufficient standard.
While smaller outages can be an annoyance for customers and major ones can be incredibly disruptive, there is an even more serious implication of insufficient network resilience: disaster response. The BBC recently reported that the UK disaster response urgently needs a £100m upgrade.
When disaster strikes, communication is integral: to contact emergency services, access crucial resources, keep in touch with loved ones, or receive vital updates from news organizations or the government.
In the UK, emergency services rely heavily on provider networks, with the Emergency Services Network (ESN) using EE’s 4G network infrastructure. UK law requires networks to take appropriate steps to prepare for potential outages, yet in June 2023, a BT network fault caused 14,000 calls to emergency services to fail over 10.5 hours. BT was ultimately fined by Ofcom for being ill-prepared for the outage, but the event still serves as a stark reminder of the impact of an insufficient MNO network resilience strategy.
Preparing for potential disaster
Ofcom’s updated guidance is sure to be expansive, meaning operators will need to be ready to adapt and update their resilience plans. Cellular analytics and monitoring software can be a vital tool in their arsenal when considering resilience strategies.
Such software can provide MNOs with crucial data to identify weak points in their networks before a major outage or disaster strikes. And in times of crisis, mapping and monitoring tools can help visualize traffic bottlenecks so additional resources can be deployed to alleviate extra strain.
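As a rough illustration of how a weak point might be identified from topology data, the sketch below flags any node whose loss would cut off part of a network. It is a brute-force check that assumes a small, connected, undirected graph. The topology, node names, and function names are hypothetical and do not represent any particular analytics product.

```python
from collections import deque


def reachable(graph: dict[str, set[str]], start: str, removed: str) -> set[str]:
    """Breadth-first search from `start`, pretending `removed` is offline."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def single_points_of_failure(graph: dict[str, set[str]]) -> list[str]:
    """List nodes whose loss would disconnect at least one other node."""
    weak_points = []
    for candidate in graph:
        others = [n for n in graph if n != candidate]
        if not others:
            continue
        # If the survivors cannot all reach each other, the candidate is critical.
        if len(reachable(graph, others[0], candidate)) < len(others):
            weak_points.append(candidate)
    return sorted(weak_points)


# Hypothetical topology: two cell sites reach the core only through one hub.
topology = {
    "core": {"hub", "site-c"},
    "hub": {"core", "site-a", "site-b"},
    "site-a": {"hub"},
    "site-b": {"hub"},
    "site-c": {"core"},
}

print(single_points_of_failure(topology))  # ['core', 'hub']
```

Adding a second, independent link from `site-a` and `site-b` to the core would remove `hub` from the result, which is the kind of change such an analysis is meant to prompt.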
Several telecom industry bodies have recognized the important role these tools play in organizing a disaster response. For example, the International Telecommunication Union (ITU) has created the Disaster Connectivity Maps (DCM), which can increase network preparedness by tracking user movements and identifying areas in need of assistance in an emergency.
Industry body the GSMA operates the Humanitarian Connectivity Charter (HCC), which it calls “a set of principles and best practices reached collaboratively between MNOs on how to prepare for, respond to, and recover from a sudden onset emergency.” The three key principles of the HCC are that disaster recovery can be enhanced by MNO coordination, scale, and partnerships.
There are also useful lessons that the UK can learn from parts of the world that are under more constant threat of climate-related disaster.
Recent wildfires in Colorado, US, showcase how effective MNO resilience strategies can protect customers from the most severe impacts of emergency situations.
Thanks to its resilience measures, Verizon already had alternate power sources in place in the region, so it was able to keep signal towers running. Users were still able to access communications services, even when they couldn’t turn the lights on at home.
Ultimately, both Ofcom and the GSMA are aligned on operators having an integral role in shoring up the UK’s defenses.
With the CrowdStrike and Microsoft outages putting a spotlight on single points of failure, and the UK’s urgent need for disaster response investment, it is clear that MNOs must ready themselves, and their network resilience strategies, if they’re to be an effective part of the country’s response.