Ken Shirriff

Here's the microcode for the division loop inside the 8086. It does a lot of subtracts and bit rotates (RCL, rotate through carry left). An internal 4-bit counter loops through the bits. The photo shows the counter on the die.
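
To make the loop concrete, here is a rough Python sketch of textbook shift-and-subtract (restoring) division. It is not a transcription of the 8086 microcode: the function name, the `bits` parameter, and the register names `upper` and `lower` are made up for illustration, and the real microcode may handle the quotient bits differently.

```python
def divide_unsigned(dividend, divisor, bits=16):
    """Divide a (2*bits)-bit dividend by a bits-bit divisor.

    Returns (quotient, remainder). Hypothetical helper for illustration.
    """
    mask = (1 << bits) - 1
    upper = dividend >> bits   # high half; ends up holding the remainder
    lower = dividend & mask    # low half; ends up holding the quotient

    # The quotient must fit in `bits` bits, otherwise signal a divide error.
    if divisor == 0 or upper >= divisor:
        raise ZeroDivisionError("divide error: quotient does not fit")

    for _ in range(bits):      # one pass per quotient bit
        # Shift the upper:lower pair left by one bit, like a pair of rotates.
        carry = upper >> (bits - 1)
        upper = ((upper << 1) | (lower >> (bits - 1))) & mask
        lower = (lower << 1) & mask

        # Trial subtraction: if the divisor fits, keep the difference
        # and record a 1 in the quotient; otherwise leave upper as is.
        if carry or upper >= divisor:
            upper = (upper - divisor) & mask
            lower |= 1

    return lower, upper
```

For example, `divide_unsigned(100, 7, bits=8)` walks through eight iterations and returns `(14, 2)`.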

Ken Shirriff

Dividing two signed (positive or negative) integers uses more microcode. This microcode makes the divisor and dividend positive, but keeps track of the final sign in an internal flag called F1. After dividing, the quotient's sign is adjusted according to F1.
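
The sign handling can be sketched in Python roughly like this. The variable `f1` stands in for the F1 flag; everything else (the function name, using `divmod` for the unsigned step) is an assumption for illustration. Giving the remainder the sign of the dividend is the x86 convention for signed division, not something spelled out in the post above.

```python
def divide_signed(dividend, divisor):
    """Signed division built on an unsigned divide. Hypothetical helper.

    Returns (quotient, remainder), with the quotient truncated toward zero.
    """
    if divisor == 0:
        raise ZeroDivisionError("divide error")

    f1 = False                          # stand-in for the F1 sign flag
    dividend_negative = dividend < 0

    # Make both operands positive, toggling f1 for each negative one.
    if dividend < 0:
        dividend = -dividend
        f1 = not f1
    if divisor < 0:
        divisor = -divisor
        f1 = not f1

    # Unsigned division on the now-positive values.
    quotient, remainder = divmod(dividend, divisor)

    # Fix up the signs afterwards.
    if f1:
        quotient = -quotient
    if dividend_negative:
        remainder = -remainder          # remainder follows the dividend
    return quotient, remainder
```

With this sketch, `divide_signed(-7, 2)` gives `(-3, -1)`, whereas Python's own `divmod(-7, 2)` rounds toward negative infinity and gives `(-4, 1)`.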

Ken Shirriff

Later chips use a faster algorithm called SRT. It uses a table to estimate quotient bits two or four at a time. Intel's Pentium chip (1993) missed a few table entries, so it occasionally got the answer wrong: the famous FDIV bug. Replacing the bad chips cost Intel $475 million.

Ken Shirriff

Division on the 8086 was very slow, taking up to 184 clock cycles because of all the looping. Modern Intel processors are much faster, but division is still slow compared to addition or multiplication. While you can now multiply every clock cycle, divisions need 6-10 clock cycles.

Ken Shirriff

This 8086 die photo shows the main functional blocks. The Arithmetic/Logic Unit (ALU) performs the subtractions and shifts. Microcode is in the ROM at the right. I removed the metal and polysilicon layers for this image so you can see the silicon transistors underneath.

Urethramancer🐀

@kenshirriff $475 million is either a firing offence, or a very expensive way to learn to be careful.
