Summary
The two Microsoft outages in July surprised the world, showing that even reliable software providers can face technical issues. These outages caused significant disruptions, rendering computers and servers inoperable. While such incidents on this scale may be rare, they still provide valuable lessons in cybersecurity resilience, highlighting the importance of ensuring system availability, diversifying IT suppliers, and conducting thorough testing in controlled environments. These insights can also be applied to sectors like banking in Singapore, helping to co-create secure digital finance frameworks for the future.
Introduction
Described by Elon Musk as the “biggest IT fail ever”, Microsoft’s outage on July 19, 2024 shocked the world and caused widespread disruption. Computers and servers were brought to a halt, leading to the cancellation of as many as 4,000 flights in the US alone and affecting around 3,700 doctors’ practices in the UK. Major financial institutions, supermarkets and transport services, among other organisations, were unable to operate normally.
That even one of the largest IT vendors can suffer system failures underscores the critical need for robust cybersecurity systems: no organisation is immune to disruptions in our increasingly digital world. How can companies ensure their systems are truly resilient? What specific lessons can the banking sector draw from these high-profile failures to better safeguard its operations?
What were the outages about?
The culprit of the July 19 outage was cybersecurity firm CrowdStrike. Its Falcon platform runs on Microsoft Windows devices to provide security protection by monitoring potential threats in real time. A sensor configuration update for Falcon contained a logic error. The Content Validator, the component responsible for verifying the integrity of rapid response content updates, was itself faulty and failed to detect the error. As a result, Falcon malfunctioned and crashed the Windows systems it was running on.
The outage affected 8.5 million Windows devices worldwide, many of which were critical to day-to-day operations, and ultimately disrupted numerous industries. According to estimates from insurer Parametrix, the incident resulted in global financial losses of approximately USD 15 billion, with 123 Fortune 500 companies (excluding Microsoft) accounting for USD 5.4 billion of the total.
Less than two weeks later, Microsoft was hit by another outage. This time, it was due to a distributed denial of service (DDoS) cyberattack that has yet to be linked to a specific threat actor. In a DDoS attack, traffic from many sources floods a server with more requests than it can handle, so legitimate users are locked out. Microsoft already had DDoS protection in place, but it was misconfigured and ended up amplifying the impact of the attack rather than mitigating it. As a result, users were unable to access multiple Microsoft services for around 10 hours.
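One common layer of protection against request floods is per-client rate limiting. The sketch below illustrates the idea with a token bucket; it is not a description of Microsoft's actual defences, and the names `TokenBucket` and `RateLimiter`, as well as the capacity and refill values, are assumptions chosen for illustration.

```python
class TokenBucket:
    """Each request spends one token; tokens refill at a steady rate."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity              # maximum burst size (assumed value)
        self.refill_per_sec = refill_per_sec  # sustained rate allowed (assumed value)
        self.tokens = capacity                # start with a full bucket
        self.last_seen = 0.0                  # timestamp of the previous request

    def allow(self, now: float) -> bool:
        # Top up tokens for the time elapsed since the previous request.
        elapsed = max(0.0, now - self.last_seen)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_seen = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: the request is rejected


class RateLimiter:
    """Keeps one bucket per client so one noisy client cannot starve the others."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.buckets = {}

    def allow(self, client_id: str, now: float) -> bool:
        if client_id not in self.buckets:
            self.buckets[client_id] = TokenBucket(self.capacity, self.refill_per_sec)
        return self.buckets[client_id].allow(now)


limiter = RateLimiter(capacity=3, refill_per_sec=1.0)
burst = [limiter.allow("client-a", now=0.0) for _ in range(5)]
print(burst)  # [True, True, True, False, False]
```

A limit set too loosely lets the flood through, while one set too tightly blocks legitimate users, which is one way a misconfigured defence can make an incident worse.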
Redesigning cybersecurity systems
The Microsoft outages highlighted the vulnerabilities inherent in cybersecurity systems and the importance of updating current cybersecurity practices to help organisations better prepare for large-scale system failures.
- Ensuring systems availability
Both incidents demonstrated the importance of systems availability. Organisations need to prioritise high availability in their systems, ensuring that they can maintain operations even in emergencies. This may involve implementing failover systems, redundant infrastructure, and backup systems.
For instance, Microsoft’s outage on July 19 had limited impact in China. That is not only because of China’s aim to establish technological independence and reduce reliance on US-based IT services vendors in light of the US-initiated technology war against China. The nation’s early initiative to plan for substituting domestic hardware, operating systems and application software also played a huge role in limiting the ripple effects of the outage.
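The failover idea mentioned above can be sketched in a few lines: try the primary system first and fall back to a redundant one if it fails. This is a minimal illustration under stated assumptions; `call_with_failover`, `primary` and `standby` are hypothetical names, and the outage is simulated.

```python
from typing import Callable, Sequence


class AllBackendsFailed(Exception):
    """Raised when neither the primary nor any backup could serve the call."""


def call_with_failover(backends: Sequence[Callable[[], str]]) -> str:
    """Try each backend in priority order and return the first success."""
    errors = []
    for backend in backends:
        try:
            return backend()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    raise AllBackendsFailed(f"{len(errors)} backend(s) failed")


def primary() -> str:
    # Simulated outage of the main system.
    raise ConnectionError("primary system unreachable")


def standby() -> str:
    # Redundant system that takes over when the primary is down.
    return "served by standby"


print(call_with_failover([primary, standby]))  # served by standby
```

The standby only helps if it does not share the primary's point of failure, for example the same vendor, software update or data centre.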
- Diversifying IT suppliers
We also learnt about the risks associated with depending heavily on a single IT vendor. While large vendors offer much more attractive pricing due to their economies of scale, and their industry status can foster significant customer trust, organisations must learn to balance this trust with an understanding of the inherent risks involved, for example by diversifying their IT suppliers.
A case in point is Yahoo. It relied heavily on Symantec for various security solutions before experiencing significant data breaches that impacted billions of user accounts in 2013 and 2014. After being acquired by Verizon in 2017, Yahoo systematically revamped its security practices and diversified its IT vendor portfolio, adding CrowdStrike for advanced threat detection and Okta for access management.
That being said, enterprises should also strike a balance between relying on a single vendor and over-diversification, which can lead to highly complex systems that are hard to manage effectively.
- Robust testing in controlled environments before launching to production
We also see the importance of testing and updating protocols in controlled environments. A lack of comprehensive testing, as was the case with CrowdStrike, can lead to unintended consequences and system failures. Organisations should prioritise thorough testing and gradual implementation to mitigate risks and ensure the stability of their cybersecurity systems.
For example, Apple runs a Beta Software Program in which participants test pre-release versions of its software, such as iOS, iPadOS and macOS, and provide feedback. This enables Apple to address issues before officially launching new software updates.
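Gradual implementation is often done as a staged rollout: an update goes to a small group of machines first and only reaches the rest if that group stays healthy. The sketch below is an illustration under assumptions, not CrowdStrike's or Apple's actual process; `staged_rollout`, `apply_update` and the stage fractions are hypothetical.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], bool],
    stage_fractions: Sequence[float] = (0.01, 0.10, 0.50, 1.0),
    max_failure_rate: float = 0.0,
) -> List[str]:
    """Update hosts in growing waves and halt after the first unhealthy wave.

    Returns the hosts that were updated successfully.
    """
    updated: List[str] = []
    done = 0  # number of hosts already attempted
    for fraction in stage_fractions:
        target = max(1, int(len(hosts) * fraction))
        wave = hosts[done:target]
        if not wave:
            continue
        failures = 0
        for host in wave:
            if apply_update(host):  # True means the host is healthy after updating
                updated.append(host)
            else:
                failures += 1
        done = target
        if failures / len(wave) > max_failure_rate:
            break  # stop before the faulty update reaches the remaining hosts
    return updated


hosts = [f"host-{i}" for i in range(100)]
attempted = []


def faulty_update(host: str) -> bool:
    attempted.append(host)
    return False  # simulated update that crashes every machine it touches


staged_rollout(hosts, faulty_update)
print(len(attempted))  # 1
```

In this simulation the faulty update reaches one machine out of 100 before the rollout stops, instead of all of them at once.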
Cybersecurity insights for banking in Singapore
Recent disruptions in banking services, such as those experienced by Singapore’s DBS Bank and POSB, highlight the critical need for robust cybersecurity measures, particularly in the financial sector. These incidents underscore the vulnerability of banking and payment services, emphasising the necessity for secure and reliable frameworks as cashless transactions become increasingly prevalent in our digital economy.
- Zero trust security model
To safeguard sensitive financial data from both internal and external threats, adopting a zero trust security model could be helpful. This approach operates on the principle that no one is trusted by default, and access to data and systems is granted only after rigorous verification. DBS Bank and UOB are taking the lead and are already transitioning to a zero trust security architecture. As other financial institutions follow suit, this collective effort will enhance the overall security framework of Singapore’s financial ecosystem.
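The "no one is trusted by default" principle can be illustrated with a deny-by-default access check, where every request is verified on its own merits regardless of where it comes from. This is a simplified sketch, not how DBS Bank or UOB implement zero trust; the `AccessRequest` fields, the `ALLOWED` policy and the user and resource names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user: str
    resource: str
    mfa_verified: bool      # did the user pass multi-factor authentication?
    device_compliant: bool  # is the device patched and managed?


# Hypothetical least-privilege policy: which user may reach which resource.
ALLOWED = {("alice", "payments-db"), ("bob", "audit-logs")}


def authorise(request: AccessRequest) -> bool:
    """Deny by default; grant access only when every check passes."""
    checks = (
        request.mfa_verified,
        request.device_compliant,
        (request.user, request.resource) in ALLOWED,
    )
    return all(checks)


ok = AccessRequest("alice", "payments-db", mfa_verified=True, device_compliant=True)
no_mfa = AccessRequest("alice", "payments-db", mfa_verified=False, device_compliant=True)
print(authorise(ok), authorise(no_mfa))  # True False
```

Even a known user is refused when one check fails, and a user with valid credentials still cannot reach resources outside their policy entry.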
- Data management and compliance
As Singapore’s central monetary regulatory body, the Monetary Authority of Singapore regularly assesses the data governance and management frameworks of selected banks. Moving forward, it has emphasised the importance of establishing detailed data governance frameworks to guard against the risk of confidential data leaking during unexpected events such as cyberattacks and outages.
- Advanced threat detection
Threat detection is vital for identifying cybersecurity loopholes, so that organisations can take preventive action rather than reactive, corrective measures. Leveraging technologies like artificial intelligence and machine learning can significantly enhance the ability to detect and respond to threats, and these advanced tools can then be integrated with existing fintech products to further bolster the security of banking systems. For instance, DBS Consumer Banking Group has been actively leveraging AI to send out 45 million personalised nudges monthly, guiding 5 million customers towards better investment and financial planning decisions.
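To give a flavour of automated threat detection, the sketch below flags activity that deviates sharply from a historical baseline. It uses a simple statistical rule (a z-score) as a stand-in for the machine learning models mentioned above; the function name `flag_anomalies`, the threshold and the login counts are made up for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence


def flag_anomalies(
    history: Sequence[float],
    recent: Sequence[float],
    threshold: float = 3.0,
) -> List[float]:
    """Return values in `recent` that lie more than `threshold` standard
    deviations away from the mean of `history`."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any different value is unusual.
        return [x for x in recent if x != mu]
    return [x for x in recent if abs(x - mu) / sigma > threshold]


# Hypothetical failed-login counts per hour for one account.
baseline = [10, 12, 11, 9, 10, 11, 12, 10]
latest = [11, 10, 95, 12]
print(flag_anomalies(baseline, latest))  # [95]
```

A production system would use richer signals and models, but the principle is the same: learn what normal looks like, then surface deviations early enough to act on them.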
Conclusion
In an era where cyber threats are increasingly varied and frequent, a reactive approach is no longer sufficient. It is crucial for organisations to pivot towards a proactive stance that emphasises systems resilience through diversified IT vendor portfolios, rigorous testing protocols, and a commitment to continuous improvement. This imperative extends beyond the finance sector; all industries involving a digital element should embrace innovation and collaboration to safeguard their operations, enhance trust and confidence among stakeholders, and ultimately pave the way for a safer and more reliable digital future.
Disclaimer: The views and opinions expressed in this article are solely those of the author and do not reflect the official policy or position of the National University of Singapore (NUS) or the NUS FinTech Lab.