Had a bunch of thoughts about the recent safety stuff, way more than fit in a social media post... Blog post story time! (It's a bit of a ramble, sorry about that...)

https://chandlerc.blog/posts/2024/11/story-time-bounds-checking/

#LLVM #Clang #MemorySafety

17 November at 1:04 | Open on hachyderm.io

15 comments

zwarich

@chandlerc One important factor in the slow march to bounds check optimization is that LLVM traditionally lagged behind both GCC and research papers in optimizations based on integer range information. I'm not even sure when/how this got better, since it was after I was following LLVM closely.

17 November at 1:28 | Open on hachyderm.io

Per Vognsen

@zwarich @chandlerc The newer ConstraintElim pass in LLVM made a relatively big difference in generated code quality when it comes to non-trivial bounds check elimination.

17 November at 1:48 | Open on mastodon.social
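
To make "non-trivial bounds check elimination" concrete, here is a minimal C++ sketch of the kind of code under discussion. `checked_at`, `sum_all`, and `sum_adjacent_pairs` are hypothetical names invented for illustration. They are not from the thread or from LLVM, and nothing here claims that any particular compiler version removes these checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: an explicitly bounds-checked element read.
inline int checked_at(const std::vector<int>& v, std::size_t i) {
    if (i >= v.size()) std::abort();  // the bounds check
    return v[i];
}

// Simple case: the loop condition already states i < v.size(),
// so the check inside checked_at can never fail here.
long sum_all(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += checked_at(v, i);
    }
    return total;
}

// Less trivial case: the loop condition is about i + 1, not i.
// Proving the check on i redundant requires reasoning that
// i < i + 1 (no wraparound) and i + 1 < v.size() together imply
// i < v.size().
long sum_adjacent_pairs(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
        total += checked_at(v, i) + checked_at(v, i + 1);
    }
    return total;
}
```

In both functions the checks are redundant by construction, but only the first follows directly from the loop condition. The second needs the optimizer to combine several inequalities, which is the sort of reasoning a constraint-based pass is aimed at.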

Paul Khuong

@pervognsen @zwarich @chandlerc there must be a lot of low hanging fruit if gcc is the benchmark :x SBCL(!) used to be stronger at integer range propagation than gcc for finer than power of 2/word size information. OTOH, I guess using range info to generate simpler div by mul is pretty niche.

17 November at 2:05 | Open on discuss.systems

Owen Anderson

@zwarich @chandlerc In the early days Chris was very opposed to investment in this area on compile time grounds.

17 November at 6:25 | Open on mastodon.online

John Regehr

@chandlerc do you know if there's a pass or small set that do most of the heavy lifting for bounds check elimination? like maybe constraint elimination? I love that one, it had been needed for a while, and it seems to do a really great job with ubsan checks

17 November at 1:41 | Open on mastodon.social

Chandler Carruth

@regehr That's the one I've heard others mention. But I've not been following LLVM closely enough to see some of these recent developments at that granularity.

17 November at 4:03 | Open on hachyderm.io

dist1ll

@chandlerc I'm not a fan of using %-diffs to make an argument around effectiveness of performance improvements. More often than not, these numbers just lead people astray.

For all we know, the 0.3% penalty might just be so small because it's being overshadowed by some other severe inefficiency in the codebase.

There's an interesting effect where inefficient code will suffer *less* from adding *more* inefficient code, because it's already bottlenecked.

17 November at 4:16 | Open on mastodon.social
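
The arithmetic behind this objection can be sketched in a few lines of C++. The function name and the numbers below are made up for illustration. They are not measurements from the blog post or from any real system.

```cpp
#include <cassert>

// Relative overhead of adding a fixed cost to a baseline.
// Both arguments are in the same (arbitrary) time units.
double relative_overhead(double added_cost, double baseline_cost) {
    assert(baseline_cost > 0.0);
    return added_cost / baseline_cost;
}
```

With these assumed numbers, adding 1 unit of checking cost to a 100-unit baseline gives `relative_overhead(1.0, 100.0)`, about 1%. The same 1 unit added to a 1000-unit baseline (ten times less efficient code doing the same work) gives `relative_overhead(1.0, 1000.0)`, about 0.1%. The absolute cost is identical. Only the denominator changed, which is why a percentage alone says little without knowing how optimized the baseline is.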

Chandler Carruth

@dist1ll So, if you look at the referenced blog post[1], we actually clarified what this represented. This is 0.3% across Google's entire main production fleet. Our fleet performance is dominated by the hottest services, a relatively small %-age, your classical long-tailed distribution. Those services are **incredibly** optimized systems. We have large teams doing nothing but removing every tiny inefficiency we can find.

[1]: https://security.googleblog.com/2024/11/retrofitting-spatial-safety-to-hundreds.html

17 November at 4:27 | Open on hachyderm.io

Chandler Carruth

@dist1ll We've also published pretty in-depth articles about this environment if you want to get a better sense:

https://dl.acm.org/doi/abs/10.1145/2749469.2750392
https://dl.acm.org/doi/abs/10.1145/3620666.3651350

17 November at 4:27 | Open on hachyderm.io

dist1ll

@chandlerc (Thanks for the articles and response)

I'm curious, how much of that optimization is done on the infra side, compared to the application side? I was under the impression that orgs prioritize infra optimizations, like PGO, data structures, stdlib stuff like memcpy, improving compilers etc.

Perhaps I'm way off base. I guess what I'm curious about is how much effort is spent on application-specific optimizations, things that perhaps *don't* carry over to other parts of the codebase.

17 November at 4:38 | Open on mastodon.social

Chandler Carruth

@dist1ll The larger applications have their own teams driving application-side optimizations. That covers a *lot* of the larger applications.

And we then also have a large team that drives infrastructure level optimizations just like what you mention.

It's a joint effort, and both teams talk extensively. So for these systems, they are *very* well optimized. There are huge incentives to find and fix any significant inefficiencies.

17 November at 4:41 | Open on hachyderm.io

dist1ll

@chandlerc Makes sense. In that case, congrats on getting such low overheads! Happy to see much of the long-standing FUD around efficient spatial safety challenged.

17 November at 4:47 | Open on mastodon.social

tone ˥˦ sandhi ˧˧ police ˨‌˦

@chandlerc
It reminds me how many programming languages got a little bit faster when processors started using branch target prediction for dynamic jumps/calls, which weren't optimized earlier because they are rare in C code.

17 November at 5:45 | Open on is.nota.live

geofflangdale

@chandlerc This seems very reasonable. In the dim past, I worked on "Omniware" (a low-level IR plus Software Fault Isolation) and we were surprised - even then - at how cheap adding more instructions to a basic block was. That's with 90s hardware - way less deep/wide than 2020s h/w, which should make 'instrumentation' costs even lower.

17 November at 8:02 | Open on mastodon.social

JVApen

@chandlerc I completely agree with your conclusion. At the same time there is a lot of (proprietary) legacy code that will be used for years without anyone actively looking to improve its security. Are there efforts ongoing to make these and previous hardening options the defaults for C++ and C in the compilers?

17 November at 9:12 | Open on mastodon-belgium.be