Jess👾

A lot of vendors make it intentionally difficult to even do manual validation and deployments these days. Windows, Chrome, Edge, Adobe, etc. all really want to auto update. One of the odd parts of this particular outage is that CrowdStrike updates are SUPPOSED to go out in stages where your test machines are on update N, staging machines are on update N-1, and prod machines are on N-2. So somehow they not only made a bad update, but they also violated their own release cadence by pushing it out to all machines no matter what version they're scheduled to be on.
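For readers who haven't seen an N / N-1 / N-2 scheme, here is a rough sketch of the idea as described above. It is not CrowdStrike's actual mechanism; the stage names, the `target_version` and `violates_cadence` helpers, and the release labels are all made up for illustration.

```python
# Hypothetical mapping of stage -> how many releases behind the newest it runs.
# "test" is on N, "staging" on N-1, "prod" on N-2, per the description above.
STAGE_OFFSETS = {"test": 0, "staging": 1, "prod": 2}


def target_version(releases: list[str], stage: str) -> str:
    """Return the release a machine in `stage` should be running.

    `releases` is ordered oldest to newest.
    """
    offset = STAGE_OFFSETS[stage]
    if offset >= len(releases):
        # Not enough release history yet: fall back to the oldest release.
        return releases[0]
    return releases[-1 - offset]


def violates_cadence(releases: list[str], stage: str, pushed: str) -> bool:
    """True if `pushed` is not the release this stage is scheduled to run."""
    return pushed != target_version(releases, stage)


if __name__ == "__main__":
    releases = ["r1", "r2", "r3"]  # illustrative labels, oldest to newest
    for stage in STAGE_OFFSETS:
        print(stage, "->", target_version(releases, stage))
    # Pushing the newest release straight to prod breaks the schedule.
    print(violates_cadence(releases, "prod", "r3"))
```

Under this sketch, a push that lands the same release on every stage at once is flagged for staging and prod, which is the kind of cadence violation described above.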

@puck
@Aphrodite @calamari

12 comments
Andrew

@JessTheUnstill @Aphrodite @calamari Agree and understand the vendors want auto update. They need to be told where to stick that idea. My experience is a lot of that is due to the updates on MS Windows being hard to manage.

Interesting to hear about the CrowdStrike release cadence. I've never used it.

In my world, we manage what is released to servers and when.

Jess👾

You can do something similar with Windows. They have "update rings". It lets you keep your systems on auto-update so IT doesn't have to manually faff with it, but you can have canaries that catch problems before prod borks.

learn.microsoft.com/en-us/mem/
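The canary idea can be sketched generically like this. It is not how Windows update rings are actually configured (that is policy in the management tooling, not code); the ring names, `roll_out`, and the `deploy` callback are hypothetical.

```python
from typing import Callable

# Hypothetical ring order: the smallest, most expendable group goes first.
RINGS = ["canary", "pilot", "prod"]


def roll_out(update: str, deploy: Callable[[str, str], bool]) -> list[str]:
    """Deploy `update` ring by ring and stop at the first unhealthy ring.

    `deploy(update, ring)` is assumed to install the update on that ring
    and return True if the machines still look healthy afterwards.
    Returns the rings that received the update.
    """
    reached = []
    for ring in RINGS:
        reached.append(ring)
        if not deploy(update, ring):
            # A ring broke: halt here so later rings never see the update.
            break
    return reached


if __name__ == "__main__":
    # Simulated bad update: the canary machines fail their health check.
    print(roll_out("bad-update", lambda update, ring: ring != "canary"))
    # Simulated good update: every ring stays healthy.
    print(roll_out("good-update", lambda update, ring: True))
```

The point is only that a bad update stops at the canaries instead of reaching prod.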
@puck
@Aphrodite @calamari

Andrew

@JessTheUnstill @Aphrodite @calamari Excellent!

However, that is for the software (incl drivers) that Microsoft supply. What about all the other random software you need to install?

Jess👾

I can't remember about Office or Adobe or Chrome whether they have things like that. It's been a few years since I worked at a Windows corp and interacted with endpoint engineering.
@puck
@Aphrodite @calamari

Andrew

@JessTheUnstill @Aphrodite @calamari Fair enough, I don't work in one either (thankfully). Interesting discussion though! Thank you.

Michael Potts (HMHackMaster)

@puck
Zscaler was very confused when I told them we would not be using their auto-update infra (even though theirs did allow for rings and stuff). We have an org-wide phased update process and we just included the zscaler client.

My management didn't like the idea as they were happy to transfer responsibility to zscaler as then it wouldn't be their fault if it broke.

I won though...

@JessTheUnstill @Aphrodite @calamari

Michael Potts (HMHackMaster)

@puck
I think some orgs value that "vendor is responsible, so it's not my fault" too much. Sure, the manager's head isn't gonna roll over this incident but productivity died and that's gonna upset a ton of people in the org.

@JessTheUnstill @Aphrodite @calamari

Andrew

@hmhackmaster Excellent to hear about your success, and that you've been vindicated (yeah, different tool, but same context)!

And agreed, many orgs will try to transfer responsibility. Will be interesting to see how well that goes.

Michael Potts (HMHackMaster)

@puck I care more about uptime and reliability than the blame game. But I am also the kind of person who has a reputation for making reasonable decisions and assuming responsibility when things go wrong.

If taking responsibility (and not dodging accountability) costs me my job then that's clearly a sign the org has lost confidence in me and it was time to move on anyways.

Hasn't happened to me yet though, and I have made some pretty big mistakes!

Grant Gould

@hmhackmaster @puck @JessTheUnstill @Aphrodite @calamari
Much like "if you haven't done your restore procedure, you don't have a backup procedure," if you haven't actually invoiced or sued a vendor for screwing up, you haven't actually transferred liability to your vendor.
Vendor accountability is 99% imaginary.

Michael Potts (HMHackMaster)

@nonnihil I think you are completely right from a business point of view, but from some upper-management person's viewpoint the "it's the vendor's responsibility" line is the path to ensure their decision can't come back to bite them.
Whatever VP or CISO approved CrowdStrike for an org isn't gonna lose their job over this.

@puck @JessTheUnstill @Aphrodite @calamari

Jess👾

And honestly, there's no way that you CAN'T put some level of trust in your suppliers. Whether it's AWS or Google Workspace or Windows or Microsoft365 or any of your anti-malware vendors or anything else, if they have a major outage, it's going to cripple your business for a while. They'll build terms into the contract about stability and reliability, but at the end of the day, if one of your critical suppliers fucks up, it's going to take you down. You pick the least bad of the options and pray.

@hmhackmaster
@nonnihil @puck @Aphrodite @calamari
