Ken Shirriff

Smart mathematicians figured out Pentium's division algorithm and the missing entries in 1995 by examining the pattern of errors. But I can confirm it in silicon. Moreover, I see 16 missing entries in the table, not just 5, but 11 of them don't cause errors due to luck. 5/9
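To see why a wrong table entry matters, here is a minimal Python sketch of radix-4 SRT-style division, the kind of algorithm the Pentium's divider is generally described as using. It is not the Pentium's actual lookup table: `select_digit` picks each quotient digit with exact arithmetic instead of from a few truncated bits, and `faulty_select` is a deliberately exaggerated stand-in that returns 0 wherever a 2 is required. All function names here are hypothetical.

```python
from fractions import Fraction

def select_digit(p, d):
    """Pick the digit in {-2,...,2} nearest to 4p/d, using exact arithmetic."""
    return max(-2, min(2, round(4 * p / d)))

def faulty_select(p, d):
    """Hypothetical broken selector: returns 0 wherever 2 is required."""
    digit = select_digit(p, d)
    return 0 if digit == 2 else digit

def srt_radix4_divide(x, d, steps=30, select=select_digit):
    """Approximate x/d for 1 <= x, d < 2, producing one base-4 digit per step."""
    x, d = Fraction(x), Fraction(d)
    p = x / 4                      # pre-scale so |p| <= (2/3)*d at the start
    q = Fraction(0)
    for j in range(1, steps + 1):
        digit = select(p, d)
        p = 4 * p - digit * d      # shift the remainder, subtract digit * divisor
        q += Fraction(digit, 4 ** j)
    return 4 * q                   # undo the pre-scaling

good = srt_radix4_divide(Fraction(19, 10), 1)
bad = srt_radix4_divide(Fraction(19, 10), 1, select=faulty_select)
print(float(good), float(bad))
```

With the correct selector the partial remainder stays within (2/3) of the divisor, so later digits can always correct earlier ones. Once a needed 2 is replaced by 0, the remainder escapes that bound and the remaining digits cannot recover. The real bug was far subtler than this sketch, since the bad entries were reached only rarely.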

Ken Shirriff

Intel claimed the bug was due to an error in a script to download the entries into the PLA. But due to the 16 missing entries, I think they made a mathematical error in constructing the table, misjudging the effect of a 7-bit adder. Here's the adder, just above the PLA. 6/9

A closeup of the adder and test circuitry just above the division PLA. I removed the metal layers to show the silicon and polysilicon. Transistors are visible as dark regions. The circuitry is mostly organized as repeating blocks, one for each bit. At the top are 8 blocks for the 8-bit adder's sum, generate, and propagate signals. (Only 7 bits of the adder are used.) Below, complex carry-lookahead circuitry computes carries in parallel to make addition fast. Below that, 8 XOR gates apply the carries. Next, multiplexers select values for testing, fed into an 11-bit shift register (LFSR) and a 13-bit shift register to test the PLA. At the bottom, larger transistors (including bipolar ones) implement drivers to send signals throughout the adder and to the rest of the processor.
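The adder structure described in that caption (per-bit sum, generate, and propagate signals, lookahead carries, then XORs) can be sketched in Python. This is a generic carry-lookahead adder under textbook assumptions, not a transcription of the Pentium's circuit, and the function name is hypothetical.

```python
def carry_lookahead_add(a, b, width=8, carry_in=0):
    """Add two width-bit integers, deriving every carry directly from the inputs."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]

    # Per-bit signals, one set per bit position.
    half_sum = [x ^ y for x, y in zip(a_bits, b_bits)]   # sum without carry
    generate = [x & y for x, y in zip(a_bits, b_bits)]   # this bit creates a carry
    propagate = [x | y for x, y in zip(a_bits, b_bits)]  # this bit passes a carry on

    # Lookahead: the carry out of bit i is
    #   g[i] | p[i]&g[i-1] | ... | p[i]&...&p[0]&carry_in
    # Each carry depends only on the inputs, so hardware can compute all in parallel.
    carries = [carry_in]
    for i in range(width):
        carry = 0
        prefix = 1                      # AND of propagate bits above position j
        for j in range(i, -1, -1):
            carry |= generate[j] & prefix
            prefix &= propagate[j]
        carry |= carry_in & prefix
        carries.append(carry)

    # Final XOR stage applies the incoming carry to each half-sum bit.
    total = 0
    for i in range(width):
        total |= (half_sum[i] ^ carries[i]) << i
    return total, carries[width]

print(carry_lookahead_add(0b0110101, 0b0011011, width=7))
```

The inner loop is only how software spells out the expanded carry equation. In silicon each term would be its own gate, so no carry has to wait for the one below it.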
Ken Shirriff

You'd expect that Intel fixed the problem by adding the 5 missing entries. Instead, they filled *all* the unused entries with 2's. This made the table easier to store in a PLA, shrinking it by 1/3. The fixed PLA has lots of unused rows at the bottom. 7/9

A closeup of the PLA circuit for the fixed Pentium showing numerous unused rows at the bottom.
Ken Shirriff

Intel said the FDIV bug was unimportant, but the public disagreed. Newspapers and TV discussed the bug. Intel claimed the bug would happen every 27,000 years; IBM said every 24 days and stopped selling Pentiums. Intel gave in and replaced Pentiums at a cost of $475 million. 8/9

Screenshot of a New York Times article in the front of the business section titled "Flaw Undermines Accuracy of Pentium Chips."
Ken Shirriff

I hope to have a blog post with more details on the Pentium FDIV bug soon. Until then, you can read about the Pentium Navajo rug: oldbytes.space/@kenshirriff/11
9/9

Ted Spence

@kenshirriff fascinating! Love to hear about the PLA space saving techniques.

Mark T. Tomczak

@kenshirriff I remember this happening.

There was this odd little movie Intel put together that was advertainment for the whole project; for some reason, I saw it at the local science museum on the big IMAX screen when I was, what, eight?

The plot, hilariously, revolved around aliens trying to disrupt human technological progress by... Messing with the chip blueprint before it's fabricated. They're caught out by the hero-kids who save the day.

We always thought it was a wild coincidence that IRL the chip went into production with a significant design flaw analogous to the one in the fiction.

Jo

@kenshirriff

But I made that bug happen a bunch of times in Lotus 123 (I don't think I had Excel at the time) when I was a kid.

So pretty often if you tried! And I remember getting a clockspeed upgrade (60 -> 90 MHz iirc) when Intel sent us a new CPU.

Ken Shirriff

@ElsaPreme The Pentium division bug is deterministic, so you can make it happen all day long if you do a particular division. The lesser-known 386 multiplication bug, on the other hand, was a circuitry issue that depended on the voltage, frequency, and temperature, so it was unpredictable.
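As a concrete example of "a particular division", here is a small Python sketch of the check that circulated widely at the time. The operands 4195835 and 3145727 are the commonly cited test case, not something taken from this thread, and the function name is hypothetical.

```python
def fdiv_residual(x=4195835.0, y=3145727.0):
    """Return x - (x / y) * y, which should be zero when the division is correct."""
    return x - (x / y) * y

print(fdiv_residual())
```

On correct hardware the residual is zero. On a flawed Pentium the same expression was widely reported to give 256, and because the bug is deterministic it would give that every time.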

John Carlsen 🇺🇸🇳🇱🇪🇺

@kenshirriff

I was buying computers for the video game developer I worked at. One department used 3D Studio, and we saw the effects of the Pentium defect clearly on the screen.

At first Intel downplayed the problem, saying nobody would be affected. Then they said they'd replace CPUs only for affected customers. Ultimately, everyone could get a replacement.

Fortunately, we were in Austin and I had been buying from Dell, which dispatched someone to our session office to replace our Pentiums.

Years later, I interviewed a job candidate who had been at Intel when the problem occurred. He said that someone simply made a mistake, but the person assigned to check their work neglected to do the job, and the manager above neglected to make sure it was done. Apparently the person who made the honest mistake was spared, but the checker and a line of managers to nearly the top were all fired for dereliction of duty.

Dr. Juande Santander-Vela

@johnlogic @kenshirriff that was surprisingly just for a big corporation, if true!

John Carlsen 🇺🇸🇳🇱🇪🇺

@juandesant @kenshirriff

Yes; I was impressed with this interviewee's story alleging that Intel had had an internal lightning strike.

Ken Shirriff

@johnlogic I've talked with a few people who worked on the Pentium and I don't think anyone got fired over it. In "The Pentium Chronicles", the error is blamed on a flawed formal proof that misled the testers into thinking a change was safe.

penguin42

@kenshirriff I can imagine they didn't want to move any other block, given that may have meant re-laying stuff out and having to redo some timing checks etc., so expanding the PLA might have been hard

James Just James

@kenshirriff Fascinating details, thanks! Is there any chance there could have been an intentional reason to create this bug? For example, would it weaken any encryption algorithms at the time, make the chips cheaper to produce, make money on stock shorts for the uncovered failure or any other scenario that was on purpose?

Solarbird :flag_cascadia:

@kenshirriff oooooo that's a neat bit of trivia

Cool work, thanks for sharing :D

Brett Wilson

@kenshirriff I went to a talk c. 2003 from a sr. Intel person who gave a description I have not heard anywhere else:

There was a die size push and somebody said "I can make the divider smaller and here is a mathematical proof that it's correct." People were so impressed by the proof they didn't notice that it was wrong and didn't care that the space savings were inconsequential (he described it as "taking out Missouri doesn't make the US smaller" 😂).

This meshes nicely with your missing entries.
