Gabriele Svelto

Plotting these types of crashes against time yields interesting trends: the more machines age, the more likely they are to encounter hardware-related failures. You might think that's obvious, and indeed it is, but until now the industry has looked the other way, based on the hand-wavy excuse that hardware failures were less common than bugs. 13/17

6 comments
Gabriele Svelto replied to Gabriele

So what needs to change? First of all, error detection and correction must become commonplace. You can already build a desktop machine with #ECC memory, but it's uncommon in laptops, even mobile workstations, and completely absent on phones and other consumer appliances. Making ECC standard will measurably lengthen the usable life of these devices. 14/17

Gabriele Svelto replied to Gabriele

Note that detection is more important than correction. The user needs to know that there's something wrong without having to run a memory testing program. Think of the lights that turn on in cars if something's malfunctioning, or the error beeps that your washing machine makes when it thinks it's leaking water. These warnings are extremely common; they need to be on computing devices too. 15/17

Gabriele Svelto replied to Gabriele

Finally, hardware design must change to make devices repairable and prolong their useful life. Yes, I'm looking at non-ECC memories soldered to the motherboard or, worse, placed on the same substrate as the CPU. 16/17

Gabriele Svelto replied to Gabriele

To end the thread, I'd like to thank my colleagues Alex Franchuk and @willcage, who did the implementation work, and my boss Gian-Carlo Pascutto, who plotted crashes against machine age. I'd also like to point out that what we have so far is preliminary data on the topic, but I fully intend to write a proper article with a detailed analysis of the data. 17/17

Gorgeous na Shock! replied to Gabriele

@gabrielesvelto My dream for a while has been that ECC memory becomes as commonplace as encryption suddenly did around 2013, and not just some weird thing only I and a few of my nerd friends do because we're overcautious and weird. 😌

William D. Jones replied to Gabriele

@gabrielesvelto Really depressing that we've reached the physical limits of creating "memory we're confident will actually store its value reliably" :(.

We've gone from PARITY CHECK 1/2 to "memory works fine without detection or correction" to "oh, now not even parity check is enough". In that sense, it's WORSE than 40 years ago :P.
