Delta CEO says CrowdStrike-Microsoft outage cost the airline $500 million (www.cnbc.com)

posted 5 months ago

MicroWave@lemmy.world

news@lemmy.world

134 comments

Delta Air Lines CEO Ed Bastian said the massive IT outage earlier this month that stranded thousands of customers will cost it $500 million.
The airline canceled more than 4,000 flights in the wake of the outage, which was caused by a botched CrowdStrike software update and took thousands of Microsoft systems around the world offline.
Bastian, speaking from Paris, told CNBC’s “Squawk Box” on Wednesday that the carrier would seek damages from the disruptions, adding, “We have no choice.”


clstrfck@lemdro.id

17 points

5 months ago

So you think Delta should’ve had a different antivirus/EDR running on every computer?


SulaymanF@lemmy.world

1 point

5 months ago

Alternatively, they could have taken Crowdstrike’s offer of layered rollouts, but Delta declined this and wanted all updates immediately to all devices.
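To make the idea of a layered rollout concrete, here is a minimal sketch in Python. It is not CrowdStrike's mechanism or Delta's configuration; the `Ring` type, the `deploy` and `is_healthy` callbacks, and the ring names are all hypothetical. The point is only that an update reaches a small group first and stops there if that group breaks.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Ring:
    """A hypothetical group of hosts that receives an update together."""
    name: str
    hosts: Sequence[str]


def staged_rollout(
    rings: Sequence[Ring],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> list[str]:
    """Deploy ring by ring and stop at the first ring that fails its health check.

    Returns the names of the rings that completed successfully.
    """
    completed: list[str] = []
    for ring in rings:
        for host in ring.hosts:
            deploy(host)
        failures = sum(1 for host in ring.hosts if not is_healthy(host))
        if ring.hosts and failures / len(ring.hosts) > max_failure_rate:
            break  # halt here: later rings never receive the update
        completed.append(ring.name)
    return completed
```

With rings ordered from a small canary group to the full fleet, an update that crashes every host it touches would only ever be deployed to the canary hosts.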


Riskable@programming.dev

2 points

5 months ago

Adding another reply since I went on a bit of a rant in my other one… You’re actually missing the point I was trying to make: No matter what solution you choose, it’s still your fault for choosing it. There are a zillion mitigations and “backup plans” that can be used when you feel like you have no choice but to use a dangerous 3rd party tool (e.g. one that installs kernel modules). Delta obviously didn’t do any of that due diligence.


ricecake@sh.itjust.works

3 points

5 months ago

Kernel module is basically the only way to implement this type of security software. That’s the only thing that has system wide access to realtime filesystem and network events.

Yes, they’re ultimately liable to their customers because that’s how liability works, but it’s really hard to argue that they’re at fault for picking a standard piece of software from a leading vendor, one that works roughly the same way as every other product in this space on every platform, which then bypassed all the configuration they could use to control updates, grabbed a corrupted update, and crashed the computer.
It’s like saying it’s the driver’s fault that the brakes on their Toyota failed and they crashed into someone. Yes, they crashed and so their insurance is going to have to cover it, but you don’t get angry at the driver for purchasing a common car in good condition and having it break in a way they can’t control.

What mitigations should they have had? All computer systems are mostly third party tools. Your OS is a third party tool. Your programming language is a third party tool. Webserver, database, loadbalancer, caching server: all third party tools. Hardware drivers? Usually third party, but USB has made a lot of things more generic.

If your package manager decides to ignore your configuration and update your kernel to something mangled and reboot, your computer is going to crash and it’ll stay down until you can get in there to tell it to stop booting the mangled kernel.


Riskable@programming.dev

1 point

5 months ago

It is absolutely not the only way to implement EDR. Linux has eBPF which is what Crowdstrike and other tools use on Linux instead of a kernel module. A kernel module is only necessary on Windows because Windows doesn’t provide the necessary functionality.

Mitigating factors: Use (and take) regular snapshots and test them. My company had all our virtual desktops restored within half an hour on that day. If you don’t think Windows Volume Shadow Copy is capable or actually useful for that in the real world then you’re making my argument for me! LOL

Another option is to use systems (like Linux) that let you monitor these sorts of EDR things while remaining super locked down. You can run EDR tools on immutable Linux systems! You can’t do that on Windows because (thanks to backwards compatibility!) that OS can’t run properly in an immutable share.

Windows was not made to be secure like that. Its security contexts are just hacks upon hacks. Far too many things need admin rights (or more privileges!) just to function on a basic level.

OSes like Linux were built to deal with these sorts of things. Linux, specifically, has gone through so many stages of evolution it makes Windows look like a dinosaur that barely survived the asteroid impact somehow.


ricecake@sh.itjust.works

3 points

5 months ago

eBPF, the kernel-level tool? Because you need to be in the kernel to have that level of access, which is what I was saying? The one with a bug that CrowdStrike hit that caused Linux servers to kernel panic?
Yes, I said “kernel module” when I should have said “software executing in a kernel context”. That’s on me.

By the way, eBPF? Third party software by most metrics. Developed and maintained by Facebook, Cisco, Microsoft, Google and friends. Also available on Windows, albeit not as deeply integrated due to the layers of cruft you mention.

I’m glad you were able to recover your VMs quickly. How quickly were you able to recover your non-virtualized devices, like laptops, desktops or that poor AD server that no one likes?
Airlines need more than just servers to operate. They also need laptops for various ground crew, terminals for the gate crew and ticketing agents, desktops for the people in offices outside the airport who manage “stuff” needed to keep an airline running.

You seem to be much more interested in talking about Linux being better than Windows, which is a statement I agree with, but it’s quite different from your original point that “Delta is at fault because they used third party tools”.

My point was that it’s unreasonable to say that Delta should have known better than to use a third party tool, while recommending Linux (not written by Delta), whose ecosystem is almost entirely composed of different third parties that you need to trust, either via system software (webserver), holding your critical data (database), kernel code (network card makers usually add support by making a kernel patch), or entire architectural subsystems (eBPF was written by a company that sells services that use it, and a good chunk of the security system was the NSA).

None of that bothers me. I just don’t get how it doesn’t bother you if you don’t trust well regarded vendors in kernel space to have those same vendors making kernel patches.


2 points

5 months ago

Sounds like they executed their plans just fine.

And due diligence is “the investigation or exercise of care that a reasonable business or person is normally expected to take before entering into an agreement or contract with another party or an act with a certain standard of care”. Having BC/DR plans isn’t part of due diligence.


Riskable@programming.dev

2 points

5 months ago

If I were in charge I wouldn’t put anything critical on Windows. Not only because it’s total garbage from a security standpoint but it’s also garbage from a stability standpoint. It’s always had these sorts of problems and it always will because Microsoft absolutely refuses to break backwards compatibility and that’s precisely what they’d have to do in order to move forward into the realm of, “modern OS”. Things like NTFS and the way file locking works would need to go. Everything being executable by default would need to end and so, so much more low-level stuff that would break like everything.

Aside about stability: You just cannot keep Windows up and running for long before you have to reboot due to the way file locking works (nearly all updates can’t apply until the process owning them “lets go”, as it were and that process usually involves kernel stuff… due to security hacks they’ve added on since WinNT 3.5 LOL). You can’t make it immutable. You can’t lock it down in any effective way without disabling your ability to monitor it properly (e.g. with EDR tools). It just wasn’t made for that… It’s a desktop operating system. Meant for ONE user using it at a time (and one main application/service, really). Trying to turn it into a server that runs many processes simultaneously under different security contexts is just not what it was meant to do. The only reason why that kinda sort of works is because of hacks upon hacks upon hacks and very careful engineering around a seemingly endless array of stupid limitations that are a core part of the OS.


clstrfck@lemdro.id

3 points

5 months ago

I enjoy hating on Windows as much as the next guy who installed Linux on their laptop once, but the bottom line is 90 percent of businesses use it because it does work.

Blaming the people who made the decision to purchase arguably the most popular EDR solution on the planet and use it (those bastards!) does nothing but show a lack of understanding of how business-related IT decisions work.


kbin_space_program@kbin.run

6 points

5 months ago

Please go read up on how this error happened.

This is not a backwards compatibility thing, or on Microsoft at all, despite the flaws you accurately point out. For that matter the entire architecture of modern PCs is a weird hodgepodge of new systems tacked onto older ones.

Crowdstrike’s signed driver was set to load at boot, edit: by Crowdstrike.
Crowdstrike’s signed driver was running unsigned code at the kernel level and it crashed. It crashed because the code was trying to read a pointer from the corrupt file data, and it had no protection at all against a bad file.

Just to reiterate: It loaded up a file and read from it at the kernel level without any checks that the file was valid.
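The kind of check being described can be sketched in a few lines of Python. This is not CrowdStrike's file format or driver code; the `CFG1` signature, the header layout, and the `parse_entries` name are invented for illustration. The idea is simply that every length and offset read from the file is checked against the file's real size before it is used.

```python
import struct

MAGIC = b"CFG1"                  # hypothetical 4-byte file signature
HEADER = struct.Struct("<4sII")  # hypothetical header: magic, entry_count, table_offset


def parse_entries(blob: bytes) -> list[int]:
    """Return the 32-bit entries stored in blob, or raise ValueError if it is malformed."""
    if len(blob) < HEADER.size:
        raise ValueError("file shorter than header")
    magic, count, offset = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    end = offset + 4 * count
    # Reject any table that overlaps the header or runs past the end of the file.
    if offset < HEADER.size or end > len(blob):
        raise ValueError("entry table out of bounds")
    return list(struct.unpack_from(f"<{count}I", blob, offset))
```

A truncated or zero-filled file is rejected with a `ValueError` that the caller can handle by skipping the update. In kernel code the equivalent bounds checks are what stand between a bad file and an invalid memory read.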

As it should, Windows treats any crash at the kernel level as a critical issue and bluescreens the system to protect it.

The entire fix is to boot into safe mode and delete the corrupt update file crowdstrike sent.

