- Delta Air Lines CEO Ed Bastian said the massive IT outage earlier this month that stranded thousands of customers will cost it $500 million.
- The airline canceled more than 4,000 flights in the wake of the outage, which was caused by a botched CrowdStrike software update and took thousands of Microsoft systems around the world offline.
- Bastian, speaking from Paris, told CNBC’s “Squawk Box” on Wednesday that the carrier would seek damages from the disruptions, adding, “We have no choice.”
Not for the rapid update that broke everything.
See the post-incident report:
How Do We Prevent This From Happening Again?
Software Resiliency and Testing
- Improve Rapid Response Content testing by using testing types such as:
  - Local developer testing
  - Content update and rollback testing
  - Stress testing, fuzzing and fault injection
  - Stability testing
  - Content interface testing
- Add additional validation checks to the Content Validator for Rapid Response Content.
- A new check is in process to guard against this type of problematic content from being deployed in the future.
- Enhance existing error handling in the Content Interpreter.
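
To make the validation, error-handling and fuzzing items above a bit more concrete, here is a rough sketch. It is not CrowdStrike's code: the `ContentUpdate` format, the field-count check and every function name are invented for illustration. The idea is that a validator rejects malformed content before it ships, the interpreter reports an error instead of crashing if bad content gets through anyway, and a small fuzz loop checks both properties on random input.

```python
import random
from dataclasses import dataclass


# Hypothetical content format, invented purely for illustration.
@dataclass(frozen=True)
class ContentUpdate:
    expected_fields: int      # number of fields the consuming code will read
    fields: tuple[str, ...]   # values actually shipped in the update


def validate(update: ContentUpdate) -> list[str]:
    """Return a list of problems; an empty list means the update may ship."""
    problems = []
    if update.expected_fields < 0:
        problems.append("expected_fields is negative")
    if len(update.fields) != update.expected_fields:
        problems.append(
            f"expected {update.expected_fields} fields, got {len(update.fields)}"
        )
    return problems


def interpret(update: ContentUpdate) -> tuple[bool, str]:
    """Read every expected field, reporting an error instead of raising."""
    values = []
    for index in range(update.expected_fields):
        if index >= len(update.fields):
            # Defensive bounds check: refuse the update rather than crash.
            return False, f"field {index} is missing; update ignored"
        values.append(update.fields[index])
    return True, ",".join(values)


def fuzz(iterations: int = 1000, seed: int = 0) -> int:
    """Feed random updates to both functions and count the rejected ones."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(iterations):
        update = ContentUpdate(
            expected_fields=rng.randint(-2, 8),
            fields=tuple("x" for _ in range(rng.randint(0, 8))),
        )
        ok, _ = interpret(update)  # must never raise, whatever the input
        if validate(update):
            rejected += 1
        else:
            # Anything the validator accepts must also interpret cleanly.
            assert ok
    return rejected
```

The two layers are deliberately redundant: the validator is the gate, and the bounds check in `interpret` is the fallback for whatever the gate misses.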
Rapid Response Content Deployment
- Implement a staggered deployment strategy for Rapid Response Content in which updates are gradually deployed to larger portions of the sensor base, starting with a canary deployment.
- Improve monitoring for both sensor and system performance, collecting feedback during Rapid Response Content deployment to guide a phased rollout.
- Provide customers with greater control over the delivery of Rapid Response Content updates by allowing granular selection of when and where these updates are deployed.
- Provide content update details via release notes, which customers can subscribe to.
Source: https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/
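
For anyone unfamiliar with what a staggered deployment with a canary stage looks like in practice, here is a minimal sketch. It is only an illustration of the general technique, not how CrowdStrike implements it: the stage sizes, the failure threshold and the `deploy` callback are all assumptions made up for the example.

```python
from typing import Callable, Sequence

# Hypothetical ring sizes: cumulative fraction of the fleet after each stage.
# The first, smallest stage plays the role of the canary.
STAGES = (0.01, 0.10, 0.50, 1.00)


def staged_rollout(
    hosts: Sequence[str],
    deploy: Callable[[str], bool],
    max_failure_rate: float = 0.02,
    stages: Sequence[float] = STAGES,
) -> tuple[bool, list[str]]:
    """Deploy stage by stage and stop as soon as a stage looks unhealthy.

    `deploy(host)` is a stand-in that applies the update and returns True
    if the host reports healthy afterwards.
    Returns (completed, hosts_updated).
    """
    updated: list[str] = []
    for fraction in stages:
        # Cumulative number of hosts that should be updated after this stage.
        target = max(1, int(len(hosts) * fraction))
        batch = hosts[len(updated):target]
        if not batch:
            continue
        failures = sum(1 for host in batch if not deploy(host))
        updated.extend(batch)
        if failures / len(batch) > max_failure_rate:
            # Halt here; a real system would also roll these hosts back.
            return False, updated
    return True, updated
```

With 1,000 hosts and an update that breaks every machine, `staged_rollout(hosts, deploy=lambda h: False)` stops after the first stage, so only 10 hosts are touched instead of the whole fleet.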