this post was submitted on 19 Jul 2024
353 points (95.4% liked)

linuxmemes


Hint: :q!


Community rules

1. Follow the site-wide rules

2. Be civil
  • Understand the difference between a joke and an insult.
  • Do not harass or attack members of the community for any reason.
  • Leave remarks of "peasantry" to the PCMR community. If you dislike an OS/service/application, attack the thing you dislike, not the individuals who use it. Some people may not have a choice.
  • Bigotry will not be tolerated.
  • These rules are somewhat loosened when the subject is a public figure. Still, do not attack their person or incite harassment.

3. Post Linux-related content
  • Including Unix and BSD.
  • Non-Linux content is acceptable as long as it makes a reference to Linux. For example, the poorly made mockery of sudo in Windows.
  • No porn. Even if you watch it on a Linux machine.

4. No recent reposts
  • Everybody uses Arch btw, can't quit Vim, and wants to interject for a moment. You can stop now.

  • Please report posts and comments that break these rules!

    founded 1 year ago

    A global IT outage has caused chaos at airports, banks, railways and businesses around the world as a wide range of services were taken offline and millions of people were affected.

    In one of the most widespread IT crashes ever to hit companies and institutions globally, air transport ground to a halt, hospitals were affected and large numbers of workers were unable to access their computers. In the UK Sky News was taken off air temporarily and the NHS GP booking system was down.

    Microsoft’s Windows service was at the centre of the outage, with experts linking the problem to a software update from cybersecurity firm Crowdstrike that has affected computer systems around the world. Experts said the outage could take days from which to recover because every PC may have to be fixed manually.

    Overnight, Microsoft confirmed it was investigating an issue with its services and apps, with the organisation’s service health website warning of “service degradation” that meant users may not be able to access many of the company’s most popular services, used by millions of business and people around the world.

    Among the affected firms are Ryanair, Europe’s largest airline, which said on its website: “Potential disruptions across the network (Fri 19 July) due to a global third party system outage … We advise passengers to arrive at the airport three hours in advance of their flight to avoid any disruptions.”

    https://www.theguardian.com/australia-news/article/2024/jul/19/microsoft-windows-pcs-outage-blue-screen-of-death

    [–] azenyr@lemmy.world 143 points 3 months ago (6 children)

    Having half of the world depend on a single proprietary corporation is the stupidest thing ever. They will learn nothing from this, sadly.

    [–] Thorry84@feddit.nl 45 points 3 months ago (4 children)

    While you are right, this outage has basically nothing to do with Windows or Microsoft. It's a Crowdstrike issue.

    [–] Diplomjodler3@lemmy.world 70 points 3 months ago (1 children)

    It also has to do with software updates being performed without the user having any control over them.

    [–] Thorry84@feddit.nl 38 points 3 months ago* (last edited 3 months ago) (2 children)

    Agreed, but again these updates were done by the Crowdstrike software. Nothing to do with Microsoft or Windows.

    In this case it was an update to the security component, which is specifically designed to protect against exploits on the endpoint. You'd want your security system to be up to date to protect as much as possible against new exploits, so updating it every day is a normal thing. In a corporate environment you do not want your end users to be able to block or postpone security updates.

    Microsoft updates get rolled out to so-called rings, each bigger than the last. This means every update has already been in use by a smaller population before it reaches the next ring, which greatly reduces the chances of an update destroying the world like this.
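To make the ring idea concrete, here is a minimal sketch of a staged rollout. It is not Microsoft's or Crowdstrike's actual mechanism; the ring names, sizes, the `install_ok` callback and the 1% failure budget are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Ring:
    name: str
    hosts: List[str]


def staged_rollout(rings: List[Ring],
                   install_ok: Callable[[str], bool],
                   max_failure_rate: float = 0.01) -> List[str]:
    """Deploy ring by ring and stop once a ring exceeds the failure budget.

    Returns the names of the rings that received the update.
    """
    deployed = []
    for ring in rings:
        # Install on every host in this ring and count the failures.
        failures = sum(1 for host in ring.hosts if not install_ok(host))
        deployed.append(ring.name)
        if ring.hosts and failures / len(ring.hosts) > max_failure_rate:
            break  # halt: the later, larger rings never see the bad update
    return deployed


# Hypothetical rings, each larger than the previous one.
rings = [
    Ring("canary", [f"canary-{i}" for i in range(10)]),
    Ring("early", [f"early-{i}" for i in range(100)]),
    Ring("broad", [f"broad-{i}" for i in range(1000)]),
]

# A hypothetical bad update that crashes every host it touches.
print(staged_rollout(rings, install_ok=lambda host: False))  # ['canary']
# A good update reaches every ring.
print(staged_rollout(rings, install_ok=lambda host: True))
```

With these numbers, the bad update stops after 10 canary hosts instead of reaching all 1110.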

    [–] Botzo@lemmy.world 26 points 3 months ago

    Best part? George Kurtz (crowdstrike CEO) won't be available for handling the fallout. He's busy racing this weekend.

    Car #04 in the entry list https://www.gt-world-challenge-america.com/event/95/virginia-international-raceway

    [–] cron@feddit.org 19 points 3 months ago (1 children)

    I absolutely expect vendors to push out new patterns automatically and as fast as possible.

    But in this case, a new system driver was rolled out. And when updating system software, I absolutely expect security vendors to use a staged rollout like everyone else.

    [–] Thorry84@feddit.nl 27 points 3 months ago (2 children)

    100% agreed, Crowdstrike fucked up with this one. I'm very interested to hear what went wrong. I assume they test their device drivers before deploying them to millions of customers, so something must have gone wrong between testing and deployment.

    Something like this simply cannot happen and this will cost them customers. Your reputation is everything in the security business: you trust your security provider to protect your systems. If the trust is gone, they are gone.

    [–] x1gma@lemmy.world 13 points 3 months ago

    I'm very interested to hear what went wrong.

    We'll probably never know. Given the impact of this fuck up, the most that Crowdstrike will probably publish is some lawyer-corpo-talk about how they did an oopsie doopsie, how complicated, unforeseen, and absolutely unavoidable this issue has been, and how they are absolutely not responsible for it, but because they are such a great company and such good guys, they will implement measures so that this absolutely never ever happens again.

    If they admit any smallest wrongdoing whatsoever they will be piledrived by more lawyers than even they'd be able to handle. That's a lot of CEO yachts in compensations if they will be held responsible.

    [–] thisbenzingring@lemmy.sdf.org 8 points 3 months ago (1 children)

    One time years ago, Sophos provided an update that blocked every updater on the machine. Each computer had to be manually updated. They are still in business. My point is that this isn't the first and won't be the last time it happens.

    [–] Thorry84@feddit.nl 7 points 3 months ago

    Yeah, I mean Microsoft can release something like Windows 11 and still be in business, so I don't expect a lot will change. But if you had any stocks in Crowdstrike, RIP.

    [–] CalcProgrammer1@lemmy.ml 12 points 3 months ago* (last edited 3 months ago)

    It's not specific to Microsoft, but the general idea of letting proprietary software install whatever it wants whenever it wants directly into your kernel is a bad idea regardless. If the user had any control over this update process, organizations could do small scale testing themselves before unleashing the update on their entire userbase. If it were open source software, the code would be reviewed by many more eyes and tested independently by many more teams before release. The core issue is centralizing all trust on one organization, especially when that organization is a business and thus profit-driven above all else which could be an incentive to rush updates.

    [–] ImplyingImplications@lemmy.ca 17 points 3 months ago

    Reminds me of when Canada lost internet for 12 million of its 33 million people because one company messed up doing maintenance.

    [–] Damage@slrpnk.net 11 points 3 months ago

    There will be no consequences for those who made this choice because going with the biggest suppliers is never wrong: they in theory have the highest reliability, and even if they don't, then it's not just your problem but everyone else's too, can't blame those responsible when the outage is akin to an "act of God"

    [–] ChocoboRocket@lemmy.world 9 points 3 months ago* (last edited 3 months ago)

    Are you suggesting lower cost and some convenience in exchange for incomprehensible risk is somehow a bad deal?

    [–] drathvedro@lemm.ee 9 points 3 months ago (1 children)

    It's great to have alternatives. If it was all Linux, and Linux got hit, then it'd be the entire world in danger. Too bad M$ is just not good enough for its second most popular position.

    [–] nova_ad_vitum@lemmy.ca 5 points 3 months ago

    Agreed on both counts. This happened because Microsoft made adoption easy. And this will be fixed within a day. None of the fundamentals have shifted. Even though it's stupid, this isn't going to fundamentally shake anything up.

    [–] slazer2au@lemmy.world 81 points 3 months ago (1 children)

    Windows PC running Crowdstrike.

    [–] deathmetal27@lemmy.world 9 points 3 months ago (1 children)
    [–] jmcs@discuss.tchncs.de 36 points 3 months ago (3 children)

    The OS getting fully bricked because of a third-party software update is still very much an OS-level fuck up.

    [–] Robin@lemmy.world 43 points 3 months ago (2 children)

    Depends. Since this is security software it probably has a kernel driver component. I think in linux a 3rd party kernel module could do the same. But the community would not accept closed source security software, especially not in the kernel.

    [–] bjoern_tantau@swg-empire.de 21 points 3 months ago (2 children)

    They even have a version for Linux, which is a kernel module.

    [–] jmcs@discuss.tchncs.de 4 points 3 months ago* (last edited 3 months ago)
    [–] qjkxbmwvz@startrek.website 8 points 3 months ago

    My Debian system was bricked when it "upgraded" to systemd.

    Required attaching a monitor to a normally headless server to fix. (Turns out systemd treats fstab differently and can hang booting if USB drive isn't attached.)

    Steam, a 3rd party program, has nuked the home directory of users who didn't really do anything wrong.

    Programs have huge abilities to bork systems, be it Windows or Linux...
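The fstab remark above comes down to how systemd handles mounts: an `/etc/fstab` entry without the `nofail` (or `noauto`) option is treated as required for boot, so a missing device can stall startup. A rough sketch of a checker for such entries follows; the function name, the sample UUIDs and the mount points are made up for illustration, and skipping `/` and swap is a simplifying assumption.

```python
from typing import List


def risky_fstab_entries(fstab_text: str) -> List[str]:
    """Return mount points whose entries have neither `nofail` nor `noauto`."""
    risky = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed entry, ignore it in this sketch
        mount_point, fs_type = fields[1], fields[2]
        options = fields[3].split(",")
        if mount_point == "/" or fs_type == "swap":
            continue  # assumption: root and swap are meant to be required
        if "nofail" not in options and "noauto" not in options:
            risky.append(mount_point)
    return risky


# Placeholder fstab contents; the UUIDs are not real.
SAMPLE = """\
# <device>      <mount>      <type> <options>        <dump> <pass>
UUID=aaaa-0001  /            ext4   defaults         0      1
UUID=bbbb-0002  /mnt/usb     ext4   defaults         0      2
UUID=cccc-0003  /mnt/backup  ext4   defaults,nofail  0      2
"""

print(risky_fstab_entries(SAMPLE))  # ['/mnt/usb']
```

On a headless server, adding `nofail` to removable drives like the flagged one lets boot continue when the drive is unplugged.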

    [–] Jozzo@lemmy.world 38 points 3 months ago (4 children)

    Got hit with this in the middle of work. We only have one customer using CrowdStrike, and only staff PCs, no infrastructure. But this one is REAL bad, caused by turning your PC on, and cannot be patched - each affected PC needs to be manually fixed. Would not be surprised to see Linux usage go up after this.

    [–] Thorry84@feddit.nl 37 points 3 months ago (1 children)

    More likely people switch from Crowdstrike to another security/audit software provider. And not to put too fine a point on it, but Microsoft will probably sweep up a lot of fleeing Crowdstrike customers with their Sentinel products.

    [–] cron@feddit.org 9 points 3 months ago (1 children)

    This seems like a huge win for Microsoft

    [–] Thorry84@feddit.nl 18 points 3 months ago (2 children)

    They are suffering from fallout because of media outlets like the one linked in this post that point the finger at Microsoft and Windows, but I feel this isn't really fair.

    If the kernel module Crowdstrike uses for Linux systems had failed everybody would rightfully point the finger at them for screwing up. But it probably wouldn't be news since their Linux solutions aren't as widespread as their Windows solutions are.

    If a Windows update had caused this kind of thing, pointing the finger at Microsoft would be justified. But Microsoft has many policies in place that prevent this kind of thing from happening. Their ring-based rollout for Windows Updates pretty much excludes this kind of thing from happening.

    [–] Nikls94@lemmy.world 37 points 3 months ago* (last edited 3 months ago) (4 children)

    Everyone is shitting on Windows, yet this thing exists on Linux as well… I have also started to dislike Windows, but this is not the time to go against Windows users; this is the time to go against Crowdstrike together for even letting this happen.

    [–] renzev@lemmy.world 19 points 3 months ago (1 children)

    I agree. I also think part of the blame can be placed on the system administrators who failed to make a recovery plan for circumstances like these -- it's not good to blindly place your trust in software that can be remotely updated.

    In Linux, this type of scenario could be prevented by configuring servers to make copy-on-write snapshots before every software upgrade (e.g. with BTRFS or LVM), and automatically switching back to the last good snapshot if a kernel panic or other error is detected. Do you know if something similar can be achieved under Windows?
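The snapshot-then-rollback flow described above can be sketched as follows. This is a toy in-memory model, not a real BTRFS or LVM interface: `SnapshotStore`, `upgrade_with_rollback`, and the update and health-check functions are all hypothetical names. On a real machine the health check would have to run somewhere that survives a failed boot, for example in the bootloader.

```python
from typing import Callable, Dict, List


class SnapshotStore:
    """Toy stand-in for a copy-on-write snapshot tool (not a real BTRFS/LVM API)."""

    def __init__(self, state: Dict[str, str]):
        self.state = dict(state)
        self._snapshots: List[Dict[str, str]] = []

    def snapshot(self) -> int:
        # Save a copy of the current state and return its id.
        self._snapshots.append(dict(self.state))
        return len(self._snapshots) - 1

    def rollback(self, snapshot_id: int) -> None:
        # Restore the state saved under the given id.
        self.state = dict(self._snapshots[snapshot_id])


def upgrade_with_rollback(store: SnapshotStore,
                          apply_update: Callable[[Dict[str, str]], None],
                          boots_ok: Callable[[Dict[str, str]], bool]) -> bool:
    """Snapshot, apply the update, and restore the snapshot if the check fails."""
    snap = store.snapshot()
    apply_update(store.state)
    if boots_ok(store.state):
        return True
    store.rollback(snap)
    return False


def bad_update(state: Dict[str, str]) -> None:
    state["sensor"] = "v2-broken"  # hypothetical faulty component version


def health_check(state: Dict[str, str]) -> bool:
    return not state["sensor"].endswith("broken")


store = SnapshotStore({"sensor": "v1"})
ok = upgrade_with_rollback(store, bad_update, health_check)
print(ok, store.state)  # False {'sensor': 'v1'}
```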

    [–] Nikls94@lemmy.world 4 points 3 months ago

    Sadly, I don’t know. I’m way worse with computers than I want to be, just careful about where I get my information.

    [–] Tier1BuildABear@lemmy.world 11 points 3 months ago

    It's the time to go against proprietary monopolizing software

    [–] doubletwist@lemmy.world 10 points 3 months ago* (last edited 3 months ago)

    Exactly, the blame here is entirely on Crowdstrike. They could just as easily have made a similar mistake in an update for the Linux agent that would crash the system and bring down half the planet.

    I will say, the problem MIGHT have been easier to fix or work around on the Linux systems.

    [–] ianhclark510@lemmy.blahaj.zone 4 points 3 months ago

    Citation needed, my NUC running Fedora made it through this without a hitch

    [–] thisbenzingring@lemmy.sdf.org 32 points 3 months ago (3 children)

    .... my work uses Crowdstrike

    I didn't see any issues rise up yesterday. Is today gonna be a bad day?

    [–] Fuck_u_spez_@sh.itjust.works 33 points 3 months ago

    The second I read this post my phone started blowing up. Good luck brother.

    [–] thisbenzingring@lemmy.sdf.org 22 points 3 months ago (1 children)

    I made an announcement on our Teams channel, and it's blowing the fuck up.... today is going to be a bad day :(

    [–] jemikwa@lemmy.blahaj.zone 8 points 3 months ago

    This occurred overnight around 5am UTC/1am EDT. CS checks in once an hour, so some machines escaped the bad update. If your machines were totally off overnight, consider yourself lucky

    [–] SapphironZA@sh.itjust.works 24 points 3 months ago (2 children)

    What amazes me is that so many big companies still use windows in critical core infrastructure.

    Windows endpoints are one thing, but anyone using Windows servers and MSSQL for mission-critical application stacks needs to be hit with the modernization hammer.

    And then on top of that, they do not have a test rollout of any changes in a test environment, before rolling it out in the production stack.

    Good luck to all the engineers in the trenches, having to fix the mistakes of their leadership.

    [–] jubilationtcornpone@sh.itjust.works 11 points 3 months ago (2 children)

    There are many, many, many specialized enterprise applications out there that are windows only.

    [–] joewilliams007@kbin.melroy.org 12 points 3 months ago

    Not when it comes to server software. In that regard, Linux is in front.

    [–] SapphironZA@sh.itjust.works 3 points 3 months ago

    Way too many. It's not the 90's or early 2000s anymore.

    [–] nightrunner@lemmy.world 11 points 3 months ago* (last edited 3 months ago)

    Windows Server OSes running CrowdStrike are affected too.

    [–] crazyminner@lemmy.ml 7 points 3 months ago (2 children)

    What does the issue do?

    The first company I worked for used Crowdstrike. Does it think the computers are infected and lock them down?

    [–] Barbarian@sh.itjust.works 15 points 3 months ago

    They pushed a driver update. That update is broken and causes the bootup sequence to fail.

    [–] possiblylinux127@lemmy.zip 8 points 3 months ago