Ken Shirriff

You'd think every computer should be able to divide two numbers, but early microprocessors didn't have division instructions. The Intel 8086 (1978) was one of the first with division. Let's look at how it implemented division and why division is so hard.

Ken Shirriff

Computers can divide by performing long division, just like grade school except using binary. This needs a subtract-and-shift loop. For early microprocessors, you'd implement the loop in assembly code. The 8086 implemented the loop in microcode, which was much faster and more convenient.
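To make the subtract-and-shift loop concrete, here is a minimal Python sketch of binary long division. It is an illustration only, not the 8086's microcode: the function name `divide_unsigned` and the `bits` parameter are made up for this example, and the dividend is assumed to fit in `bits` bits.

```python
def divide_unsigned(dividend, divisor, bits=16):
    """Binary long division by shift-and-subtract (illustrative sketch)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    # Walk the dividend from the most significant bit to the least.
    for i in range(bits - 1, -1, -1):
        # Shift the next dividend bit into the running remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # If the divisor fits, subtract it and record a 1 bit in the quotient.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


print(divide_unsigned(100, 7))  # (14, 2)
```

Each pass through the loop produces one quotient bit, so the cost grows with the word size, which is why a loop like this is slow whether it runs in assembly or in microcode.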

Ken Shirriff

Many CPUs use microcode internally: a level of code even lower than machine code. Microcode specifies each step of a machine instruction. Each 8086 micro-instruction is 21 bits long, performing a data move and an action in parallel. Microcode is low-level & hard to understand.

Ken Shirriff

Here's the microcode for the division loop inside the 8086. It does a lot of subtracts and bit rotates (rotate carry left, RCL). An internal 4-bit counter loops through the bits. The photo shows the counter on the die.

Ken Shirriff

Dividing two signed (positive or negative) integers uses more microcode. This microcode makes the divisor and dividend positive, but keeps track of the final sign in an internal flag called F1. After dividing, the quotient's sign is adjusted according to F1.
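The sign-handling idea can be sketched in a few lines of Python. This is a hedged illustration, not the actual microcode: `divide_signed` and `negate_quotient` are hypothetical names (the latter standing in for the role the post assigns to F1), and the remainder taking the sign of the dividend is the x86 convention for signed division rather than something stated in this thread.

```python
def divide_signed(dividend, divisor):
    """Signed division built on unsigned division (illustrative sketch)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    # Remember whether the result should be negative, like a sign flag.
    negate_quotient = (dividend < 0) != (divisor < 0)
    # Divide the magnitudes; both operands are now non-negative.
    quotient, remainder = divmod(abs(dividend), abs(divisor))
    # Fix up the signs afterward.
    if negate_quotient:
        quotient = -quotient
    if dividend < 0:
        # Assumption: remainder follows the dividend's sign (x86 convention).
        remainder = -remainder
    return quotient, remainder


print(divide_signed(-7, 2))  # (-3, -1)
```

Note that this truncates toward zero, unlike Python's own `//` operator, which rounds toward negative infinity.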

Ken Shirriff

Later chips use a faster algorithm called SRT. It uses a table to estimate quotient bits two or four at a time. Intel's Pentium chip (1993) missed a few table entries so it occasionally got the answer wrong, the famous FDIV bug. Replacing the bad chips cost Intel $475 million.

Ken Shirriff

Division on the 8086 was very slow, taking up to 184 clock cycles due to all the looping. Modern Intel processors are much faster, but division is still slow compared to addition or multiplication. While you can now multiply every clock cycle, divisions need 6-10 clock cycles.

Ken Shirriff

This 8086 die photo shows the main functional blocks. The Arithmetic/Logic Unit (ALU) performs the subtractions and shifts. Microcode is in the ROM at the right. I removed the metal and polysilicon layers for this image so you can see the silicon transistors underneath.

Urethramancer🐀

@kenshirriff $475 million is either a firing offence, or a very expensive way to learn to be careful.

Minoru Saba

@kenshirriff Thanks for jogging distant memories of microcode programming the instruction set for a prototype minicomputer for a subsidiary of a British conglomerate now long gone. Vaguely remember that implementing multiplication was relatively easy; trying to reduce the microcode steps in the division loop to make it go faster was hard.

Dave Bittner

@kenshirriff I have a vague recollection of the 6809 having an advantage over the 6502 when it came to being able to divide more quickly. Interesting stuff!
