Stuart

In an unexpected turn of events, a sensible take on #Crowdstrike from the Orange Site.

Source: news.ycombinator.com/item?id=4

112 comments
Donald Ball

@calamari I have a whole theory about how the development processes that SOC2, Fedramp, etc. all but mandate in order to survive an audit freeze the design of covered systems, often prematurely, and actively impede the evolved practices that might otherwise have improved their quality and reliability.

Wanja

@donaldball @calamari I don't know SOC2 that well but I do work on critical infrastructure that is certified (with 0 findings!) to a similar German standard despite not using any of these scary products.

Yes you need to explain to your auditor how you intend to meet your security objectives despite not having bought the proprietary appliance that claims to magically make you do that. But you'll manage.

Wanja

@donaldball @calamari "If you don't buy XYZ you'll surely fail your audit" is repeated across the industry as a truism but barely ever put to the test.

Steffen Weinreich

@muvlon @donaldball @calamari "It ticks a box". As long as it is easier to deploy software that ticks a box than to discuss with your auditor, each and every year, why you insist on doing it yourself, we will see incidents like today's 🤷

pebcak

@muvlon @donaldball @calamari exactly. you can work with the regulatory entities & auditors, but you have to know what you do.

Steve🏳️‍🌈

@donaldball @calamari

This is far more about the companies doing the implementations than the compliance frameworks themselves. Companies will do the bare minimum to pass audit and then completely ignore the ongoing audits, assessments, and improvement cycles demanded by the compliance framework. With SOC2 you can get away with some of this, less so in FedRamp, and even less so in ISO but companies don't want to spend money to mitigate risks - all they want is tech magic they can ignore.

Daniel Farina

@gaysteve @donaldball @calamari this sounds like the most accurate description to me. I’ve been through the SOC process a few times, I can see how companies want to take some mostly reasonable norms on what they’re supposed to audit and try to abstract it to a software package.

I have always found the anti-malware norms both reasonable in principle and vexing in implementation myself. This is where invasive endpoint software shows up.

Aphrodite ☑️ :boost_ok:

@calamari

Checklists are only as useful as the knowledge necessary to know why the checklist exists.

Pilots and surgeons train for extensive periods so they learn why they need to go through their checklists.

What happens far too often is checklists turning into ritual disconnected from the rationale.

Religion often has this problem. Many of the rituals of religion have roots in Something Deep From Back In The Day, but that link, with time, has since worn away.

Tim Hergert

@Aphrodite @calamari we've progressed from "cargo cult" to "checklist cult"

At least with the former, we got to build cool bamboo models of planes and control towers.

Aphrodite ☑️ :boost_ok:

@cjust @calamari

tbh the Adeptus Mechanicus of 40K make too much sense in that framing

they don’t know why tech works, they just know to do the rituals and they can make a thing

Tim Hergert

@Aphrodite @calamari I spent far too long one weekend looking into nuclear semiotics and have decided that the best thing that we could do for future generations is genetically engineer a cat species to glow in the presence of radiation rather than try to instill a nuclear priesthood.

I think that this same logic should be applied to software QA as well. I'm certain that we can bioengineer a cat to glow in the presence of a faulty AV update. Then we can change the checklist item to "□ IT department properly equipped with glowy cats"

TomDB 🦣

@Aphrodite @cjust @calamari the last paragraph seems to apply to modern day youth using AI to do homework as well.

Sebastiaan Dammann

@cjust @Aphrodite @calamari That's because reviewing the checklists can then be - no offence intended - offloaded to cheap workers in 2nd and 3rd world countries who are judged by the checklists they sign off. There is no room for critical thinking or adapting to the particular situation. I see this happening daily.

Twirrim

@Aphrodite @calamari
I'll have to see if I can find the study, but a study was done at a US hospital to compare reported checklist completion with actual checklist completion.

They found out of many dozens of entries in the list that it was rare for any of them to actually be done, including the one to double check the patient's name prior to anaesthetic, to make sure they're about to operate on the right person!

Dweebish

@Twirrim @Aphrodite @calamari As someone who's had a lot of surgeries & procedures involving anesthetic in the past 5 years, I'm quite thankful that my identity has been checked and re-checked every time.

Andrew

@Aphrodite @calamari an even more important check box is change management. How can you have effective change management when updates are applied automatically? If compliance frameworks require automatic updates, then they're broken, and given what has just happened, I really hope they'll be fixed.

Sure, have EDR etc, but the updates need to be validated, then rolled out by the organisations.

Sadly, as the world just discovered, there is no silver bullet when it comes to security.

Jess👾

A lot of vendors make it intentionally difficult to even do manual validation and deployments these days. Windows, Chrome, Edge, Adobe, etc. all really want to auto update. One of the odd parts of this particular outage is that CrowdStrike updates are SUPPOSED to go out in stages where your test machines are on update N, staging machines are on update N-1, and prod machines are on N-2. So somehow they not only made a bad update, but they also violated their own release cadence by pushing it out to all machines no matter what version they're scheduled to be on.

@puck
@Aphrodite @calamari
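
The N / N-1 / N-2 cadence described in this post can be sketched roughly as follows. This is a minimal illustration only: the group names, the `OFFSETS` table, the `pinned_version` helper and the version strings are all hypothetical and are not taken from CrowdStrike's product or API.

```python
# Hypothetical sketch of version pinning by environment.
# Nothing here reflects a real vendor API; the version strings are made up.
OFFSETS = {"test": 0, "staging": 1, "prod": 2}  # releases behind the latest


def pinned_version(releases, group):
    """Return the release a group should run.

    releases: list of version strings ordered oldest to newest.
    group: one of the keys in OFFSETS.
    """
    offset = OFFSETS[group]
    # Clamp so a short release history never indexes out of range.
    index = max(len(releases) - 1 - offset, 0)
    return releases[index]


releases = ["7.13", "7.14", "7.15", "7.16"]
for group in OFFSETS:
    print(group, pinned_version(releases, group))
```

A pin like this only covers whatever artifact the version list describes. Anything delivered outside that list reaches every group at once.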

Andrew

@JessTheUnstill @Aphrodite @calamari Agree and understand the vendors want auto update. They need to be told where to stick that idea. My experience is a lot of that is due to the updates on MS Windows being hard to manage.

Interesting to hear about the CrowdStrike release cadence. I've never used it.

In my world, we manage what is released to servers and when.

Jess👾

You can do similar with Windows. They have "update rings". It lets you keep your systems on auto-update so IT doesn't have to manually faff with it, but you can have canaries before prod borks.

learn.microsoft.com/en-us/mem/
@puck
@Aphrodite @calamari
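
The "canaries before prod" idea can be sketched generically as a ring-by-ring rollout that halts at the first unhealthy ring. This is an assumption-laden illustration, not how Microsoft's update rings are configured: `roll_out`, the ring names, the host names and the `is_healthy` callback are all hypothetical.

```python
def roll_out(rings, is_healthy):
    """Deploy ring by ring and stop at the first ring that reports a failure.

    rings: list of (ring_name, [hosts]) ordered from canary to broad.
    is_healthy: callable(host) -> bool, a stand-in for a real health check.
    Returns (hosts_that_received_the_update, name_of_halting_ring_or_None).
    """
    updated = []
    for name, hosts in rings:
        updated.extend(hosts)  # this ring receives the update
        if not all(is_healthy(h) for h in hosts):
            return updated, name  # halt: later rings are never touched
    return updated, None


rings = [
    ("canary", ["c1", "c2"]),
    ("pilot", ["p1"]),
    ("broad", ["b1", "b2", "b3"]),
]

# A bad update that breaks every host it lands on stops at the canaries.
updated, halted = roll_out(rings, lambda host: False)
print(updated, halted)
```

With a healthy update the same call walks through all three rings and returns `None` for the halting ring.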

Andrew

@JessTheUnstill @Aphrodite @calamari Excellent!

However, that is for the software (incl drivers) that Microsoft supply. What about all the other random software you need to install?

Jess👾

I can't remember about Office or Adobe or Chrome whether they have things like that. It's been a few years since I worked at a Windows corp and interacted with endpoint engineering.
@puck
@Aphrodite @calamari

Andrew

@JessTheUnstill @Aphrodite @calamari Fair enough, I don't work in one either (thankfully). Interesting discussion though! Thank you.

Michael Potts (HMHackMaster)

@puck
Zscaler was very confused when I told them we would not be using their auto-update infra (even though theirs did allow for rings and stuff). We have an org-wide phased update process and we just included the zscaler client.

My management didn't like the idea as they were happy to transfer responsibility to zscaler as then it wouldn't be their fault if it broke.

I won though...

@JessTheUnstill @Aphrodite @calamari

Michael Potts (HMHackMaster)

@puck
I think some orgs value that "vendor is responsible, so it's not my fault" too much. Sure, the manager's head isn't gonna roll over this incident but productivity died and that's gonna upset a ton of people in the org.

@JessTheUnstill @Aphrodite @calamari

Andrew

@hmhackmaster Excellent to hear about your success, and that you've been vindicated (yeah, different tool, but same context)!

And agreed, many orgs will try to transfer responsibility. Will be interesting to see how well that goes.

Michael Potts (HMHackMaster)

@puck I care more about uptime and reliability than the blame game. But I am also the kind of person who has a reputation for making reasonable decisions and assuming responsibility when things go wrong.

If taking responsibility (and not dodging accountability) costs me my job then that's clearly a sign the org has lost confidence in me and it was time to move on anyways.

Hasn't happened to me yet though, and I have made some pretty big mistakes!

Karl Baron

@Aphrodite @calamari When a company gets hacked and sued, they have to answer to "were you negligent in protecting against this or were you just unlucky?". Courts are incompetent in determining this, and companies are mostly actually negligent (because they don't want to pay for it), so we get these "best practices" checklists instead.

How do you legislate competence? Most companies can't even determine if the people they hire are competent!

Ryan Boswell 🏳️‍🌈

@calamari If I had a dollar for every hour that Zscaler disrupted someone on my team or one of our stakeholders internally because of how aggressive it got, I’d be able to retire.

feld

@calamari regulatory bodies will NOT be held accountable, so this is just wishcasting

Hudsoncress

@calamari What's interesting about this is that best practice is to be on n-2, or two versions behind on driver updates. Which we are. But this was a policy update, or a channel update, where they modified the detections such that it borked ALL versions of the driver. TL;DR: senior leadership assumed we would be covered to prevent this, but n-2 doesn't mean what everyone thought it meant.

Frazell Thomas

@hudsoncress @calamari They don’t like to be behind on security updates though. These were definition files so being n-2 would mean exposure to 1 and 2 day critical security vulnerabilities.

This isn’t the first major crisis caused by rapid fire security updates. It won’t be the last.

Hudsoncress

@LogicalApex @calamari It's just wild that Crowdstrike pushed that apparently untested definition file globally, and was able to hit hundreds of millions of endpoints before anyone saw it was literally breaking every computer it touched? I mean... WTF

Hudsoncress

@LogicalApex @calamari also, quoth the vendor, "There is no way at this time" to turn off channel updates... SLT is gonna love that.

Andreas K

@hudsoncress @LogicalApex @calamari
I'm sure the contracts make sure that the vendor pays for all damages they caused, RIGHT?

Because if not, if it were malware, there would be at least hope that the responsible would be prosecuted at some point.

Frazell Thomas

@yacc143 @hudsoncress @calamari Companies put boiler plate language in their contracts. They either absolve themselves of any liability for damages or limit their liability to your license fee. Probably also includes a mandatory arbitration clause to further limit liability fallout.

I bet that’s the case here too.

😬

Hudsoncress

@LogicalApex @yacc143 @calamari what’s interesting is how we all assumed n-2 would save us from this but nobody was clear beforehand that the real risk was a policy update, not a driver version.

Andreas K

@LogicalApex @hudsoncress @calamari Interestingly, so they sell you a product that does something, on most days what the sales prospectus says, and on some days destroys your IT, and say enjoy, you cannot sue us, and the IT crime laws don't apply to us, as you voluntarily provided us with access to all your IT.

Now purely as the IT guy, that is GREAT.

Andreas K

@LogicalApex @hudsoncress @calamari
And BTW, my current employer asked for exactly that and more: in what the CEO said was IT standard boilerplate, they asked that I, as the little IT contractor, make them whole not only for my mistakes but also for all products I used or that they asked me to use.

Admittedly he crossed out this paragraph when I explained to him the issues ;)

Tak!

@calamari Oh yeah, zscaler is a goddamned plague too, it's only a matter of time before it causes a massive outage and/or breach (in contrast to the constant mini-outages it causes during normal operation)

Vincent :coffeecup:

@calamari Not wrong, but I still feel like pushing out a critical update that breaks global commerce isn't entirely on the airlines and banks.

Delta Airlines has to check that box. CrowdStrike sold them a product saying that they can safely check that box with their product.

Jeff Craig

@vincent @calamari It's still at least somewhat on Delta and others for allowing CrowdStrike to blindly update their entire fleet without using progressive rollouts and canaries.

There is a lot of fault to go around here; a more thoughtful approach to compliance would have made this bad, but not catastrophic.

Andreas K

@vincent @calamari
Did Delta insist that CrowdStrike indemnify them in case that their product breaks their IT for commercial damages their product causes?

(Then CrowdStrike perhaps would not be so willing to claim so many things.)

The whole thing of companies working without accountability mega-sucks.

So I broke the Internet today, but there will be 0 feedback to prevent this in the future.

Ian Turton

@yacc143 @vincent @calamari I always recall a discussion I had with an aerospace engineer about liability for compiler bugs, and he said what was the point, why would they want to end up owning Cray Systems if their planes started falling out of the air. Of course that was back in the 90s when we didn't expect planes to fall out of the sky.

Andreas K

@ianturton @vincent @calamari
The point is that there needs to be some adjusting of the weighing functions.

And in our "capitalist" world, this happens by assigning costs.

As long as there is no direct cost for bugs, even catastrophic ones, companies will ship products with catastrophic bugs just to meet the schedule some marketing egghead invented for artificial reasons.

piofthings

@calamari what's the difference between compliance and checkbox compliance… they are almost always a checkbox exercise!!! Agree with everything you said! The number of times I have fought compliance auditors about outdated checkbox compliance requirements… sigh!!!

Mikalai

@calamari
Indeed, this is a good moral, you are responsible for whatever third party stuff you put into your critical paths

Bert

@calamari What I also find funny is that orgs ask to have everything encrypted in the cloud and then proceed to install an agent that has access to everything, has root powers and can send whatever it wants to some place on the internet.

But we can tick checkboxes.

noahlz

@calamari except this was not a software patch, it was a "content update" CEO's words. Very very difficult to put blame on orgs here, many of whom deliberately stay several versions behind latest for this reason.

Daniel Farina

@calamari funny thing is I don’t think SOC2 can be termed regulation precisely. The norms of what you put in SOC2 reports are, unless working with government, an emergent phenomenon of private industry expectations.

The basic framework of SOC2 is “you say you do these things, audit firm proves it to some extent.”

David Megginson

@calamari I'm willing to blame CrowdStrike for building a huge business around exploiting that organisational checklist dysfunction and then not bothering to take even basic precautions to avoid bringing all their customers down. After all, they're the ones who pitched themselves as the experts; their customers' sin was ignorance and misplaced trust.

Fintanz

@calamari also regulators enable tech monopolies, so everyone is forced to use the same software.

Viss

@calamari this has been an issue for fifteen years. the compliance/checkbox type of "security" isn't even security. it's doing the legal bare minimum to avoid fines.

CatSalad🐈🥗 (D.Burch) :blobcatrainbow:

@calamari @Viss

[Orgs] are more scared of failing an audit than they are of the consequences of failure of the underlying systems the audits are supposed to be protecting.

100% on the money with that bit.

J4YC33 ❌

@catsalad @Viss@mastodon.social This is *by far* one of the hardest hurdles to overcome in my job.

jollyrogue

@catsalad It’s on the money.

This one is getting printed and posted on my desk.

@calamari @Viss

Scott Williams 🐧

@calamari I don't think I can possibly agree with this more.

sortius

@calamari 100% on this. A big problem is, the people with the technical know how to say "this will fail in a big way" are either ignored or so jaded they don't speak up.

The people who didn't scream "this is shit" are as much to blame as the idiots in governments pushing through ham fisted legislation that's supposed to stop white collar crime, but never seems to

Tony Hoyle

@sortius @calamari I bet they did and were overruled by management.

I've had to be the one screaming but luckily small company and the CEO trusts me.

sortius

@tony @calamari yeh, I know how it feels to not be listened to. And nobody likes it when you spend a week saying "I told you so" in ways that aren't so blatant

root42

@calamari I am wondering when Zscaler will become a target for hackers. Zero trust? Well... using a central proprietary solution to "protect" a business from its employees sounds like nothing could go wrong....

Luka Rubinjoni

@calamari Once the org passes the audit, every problem is somebody else's problem.

Varyag

@calamari@mastodon.social this kind of thinking is how disasters like Chernobyl happened. While this isn't as destructive as a failed nuclear reactor, it keeps happening. At some point, some things should change.

Deborah Pickett

LB 👆🏻 Even our little factory has been getting pressure to deploy endpoint surveillance onto every user device, because some of our customers want us to have “cyber insurance” in order to do business, and the insurer lists endpoint threat detection among the things we should buy. Classic box-ticking behaviour.

Drew Mayo

@futzle true, but blaming “regulations” in that shot was also a hell of a red flag.

Raven667

@futzle these are good things, an opportunity to make your systems run well and reliably under the auspices of security compliance. You can choose _how_ to comply as long as you are willing to explain it to the auditor. Be proud of what you build and that shouldn't be a problem 💪

jollyrogue

@calamari I’ve run into people complaining about HIPAA not prescribing solutions, and this articulates exactly why that is such a bad idea.


@calamari "nobody gets fired for buying from IBM"

I guess I'm showing my age.

John Deters

@Qbitzerre @calamari "Nobody ever got fired for buying CrowdStrike"

... because nobody's fixed the HR team's computers yet.

Bruce Elrick

@calamari
One thing to remember in capitalism is that risk takes a back seat to liability.

Given two paths, one that reduces risk and the other that reduces liability, the system always rewards the latter decision, even when it increases overall risk.

Bruce Elrick

@calamari
To be fair to capitalism, this is probably a feature of hierarchical responsibility more than anything, therefore affects every bureaucracy, including most governments.

Jenny Fx

@calamari Hmm I've had workdays scuppered by ZScaler before too!

It is true though.

Tintinaus

@calamari Even worse at my company where they decided to outsource their entire IT department. That means no one in our company is in the direct decision making of any roll out of processes.

Raven667

@calamari checkbox compliance attitudes kinda tick me off, there are opportunities to effect thoughtful change but not if you don't respect the process or have the confidence to make decisions. You can comply with the letter of the rule and pass off responsibility or comply with the *spirit* of the rule and _take_ responsibility.

Eric Likness

@calamari Tonight on the PBS News Hour, they had Bruce Schneier as a guest to weigh in. His summary was, "It's all economics", companies want to grow fast and break things, and the markets want that. And companies that buy those products,... as you point out, are trying to tick a box on an audit form. If we did this the `right way` it would cost more money,...

Nic Lake :pika:

@calamari @Meyerweb And then there’s my org using both Crowdstrike AND Zscaler

Michał "rysiek" Woźniak · 🇺🇦

@calamari link?

What is the context? Is ZScaler the third-party software? Do you know how it was involved in this failure?

Matt Franz

@calamari Amen. Auditors are more likely to drive threat models than real adversaries.

Sebastiaan Dammann

@calamari oh man, I hate ZScaler with a passion. It makes our development machines so slow, and we can't even do something basic like checking if a deployed site runs a correct SSL certificate

aggeka

@calamari I see this all the time also.
People tend to forget (or have never learned) that not servicing your customers is also in violation of GDPR article 32 part 1 b)

🥲

Kevin Granade

@calamari ehhhhhhh from being lightly involved in compliance, I never saw a, "oh no suddenly there's new compliance!". It was *always* "we need to push all these features first because an exec has a bonus tied to it. (actual, literal months pass) Oh crap we're late now and have to rush out a fix".
I'm very skeptical of people throwing shade at regulatory bodies for this kind of thing.

Laurent Bloch

@calamari ISO 27001 is a perfect example of the bureaucratic way of ensuring system security.

xenogon

@calamari This is well expressed and has much wider applicability than this case.

I know next to nothing about this sort of software - but this lesson is applicable to almost everything in modern heavily regulated life. Compliance is the enemy of function and of resilience. *In general.*

This could be attributed to poorly designed rules. But this misses the point. Rules are a substitute for thought and understanding, and there is no substitute.

Lists of issues to check are useful mnemonics, but actively harmful when the person checking doesn't actually understand the system. They are really only useful when made and used by the same person.

Try asking someone in the building trade why certain features of houses are built a certain way. You'll often hear "because it's a legal requirement", or you'll hear something made up on the spot which is incorrect or incoherent and sometimes physically impossible. The real reasons are often long lost and possibly no longer applicable.

The same issue was evident with covid, as soon as people stopped thinking about how the virus is transmitted, and talked instead about what they were and weren't allowed to do, we had lost.

Tokyo Outsider (337ppm)

@calamari @kushal Not sure this take is as sensible as it might appear. Regulations exist for a reason. Regulators are not responsible for an organisation's flawed attempt to satisfy the regulation. A business is responsible for managing its own risks and meeting regulatory requirements. No one has forced these businesses to operate with disregard for their own business risks.

Bèr Kessels 🐝 🚐 🏄 🌱

@calamari why unexpected, though? The orange site has most often a few sensible and nuanced takes on hot topics. They then bubble up and push the tech bro nonsense away almost always.

Not on the < 30 comment threads obv. But hot topics? I keep getting positively surprised there.

Marius Kießling

@calamari One thing I would add to this, which I have witnessed too often first hand, is the incompetence in evaluating those risks. When the consultants say that this should be done, there is no further probing into whether or not this actually mitigates or even introduces risks.

Orca🌻 | 🏴🏳️‍⚧️

@calamari@mastodon.social Heard of a company that would rather lock up account (pam faillock or sth) instead of using properly configured fail2ban/denyhosts to block password bruteforce attack based on IP because WOOHOO COMPLIANCE SAID SO smh ​:blobcatshrug2:​
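
The difference between the two policies mentioned here can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not reproduce the configuration or behaviour of `pam_faillock`, fail2ban or denyhosts, and `lockout_targets`, the threshold and the sample addresses (from documentation-reserved ranges) are hypothetical.

```python
from collections import Counter


def lockout_targets(failed_logins, threshold):
    """Compare a per-account lockout policy with a per-IP ban policy.

    failed_logins: list of (source_ip, account) tuples for failed attempts.
    threshold: number of failures that triggers the policy.
    Returns (accounts locked under per-account policy,
             IPs banned under per-IP policy).
    """
    by_account = Counter(account for _, account in failed_logins)
    by_ip = Counter(ip for ip, _ in failed_logins)
    locked = {a for a, n in by_account.items() if n >= threshold}
    banned = {ip for ip, n in by_ip.items() if n >= threshold}
    return locked, banned


# One address hammers alice's account; bob mistypes his password once.
attempts = [("203.0.113.9", "alice")] * 5 + [("198.51.100.4", "bob")]
locked, banned = lockout_targets(attempts, threshold=5)
print(locked, banned)
```

In this model the per-account policy locks the legitimate owner of `alice` out along with the attacker, while the per-IP policy blocks only the offending address. The per-IP policy has its own gap: an attack spread across many addresses may never reach the threshold on any single one.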

georgebaily

@calamari we need better checkboxes for auditing how good the audit checkboxes are

Lewis Cowles

@calamari
I like this, but it's not nearly unkind enough to folks who just do not engage brain at work, at all.

Compliance is meant to be a tool, to expose businesses to robust thinking around problems. IMO anyone, who knowingly does check-box compliance, (we'll just buy in / defer compliance) for any reason, should be shown the door.

David Fleetwood - RG Admin

@calamari At Amazon in 2019 I argued against the Falcon sensor on the hosts I was responsible for. I agreed their service was good, but the lack of transparency, control and even ability to test changes made it a no go for me. I asked why we weren't either building our own solution or just buying them.

I was overruled and I'm sure those systems, if they still exist, had a fun past few days.

Again, I like CS, but major tech companies are not short on resources or expertise.
