
Binary multiplication is much like grade-school long multiplication, but simpler because each step adds either 0 or the multiplicand. A processor can implement this by shifting the number and adding it in each cycle of a loop. On processors such as the 6502, this was done in assembly code.
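
As a rough illustration of the shift-and-add idea, here is a minimal Python sketch. The function name and the 16-bit default width are my own choices for the example, not anything taken from a particular processor.

```python
def shift_add_multiply(multiplicand, multiplier, bits=16):
    """Multiply two unsigned integers by shifting and adding.

    Illustrative sketch only: one loop pass per multiplier bit.
    """
    product = 0
    for _ in range(bits):
        # Each step adds either 0 or the (shifted) multiplicand.
        if multiplier & 1:
            product += multiplicand
        multiplicand <<= 1   # move to the next binary "column"
        multiplier >>= 1     # expose the next multiplier bit
    return product


print(shift_add_multiply(13, 11))  # 143
```

The loop never multiplies anything: each pass is a test of one bit, an optional add, and two shifts.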

16 Mar 2023 at 5:26 | Open on oldbytes.space

11 comments

Ken Shirriff

The 8086 processor uses microcode, implementing instructions with an even lower layer of micro-instructions. Each 21-bit micro-instruction could move data between registers and perform an arithmetic operation, a conditional jump, or even a micro-subroutine call.

16 Mar 2023 at 5:26 | Open on oldbytes.space

Ken Shirriff

Here's what the main microcode loop for multiplication looks like. It rotates values right through carry (RRCY) and does ADDs subject to conditions. (Σ is the output from the ALU.) This loop executes 16 times to multiply 16-bit words.
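
For anyone who wants to step through the idea, here is a rough Python model of a rotate-and-conditionally-add loop. It is a sketch, not a transcription of the microcode: the names tmp_a, tmp_b, and tmp_c, the helper rrcy, and the exact ordering of the rotates are assumptions made for the example.

```python
def rrcy(value, carry, bits):
    """Rotate right through carry: returns (new_value, new_carry)."""
    new_carry = value & 1
    new_value = (value >> 1) | (carry << (bits - 1))
    return new_value, new_carry


def multiply_loop(multiplier, multiplicand, bits=16):
    """Unsigned multiply modeled on a rotate/add loop (assumed structure)."""
    mask = (1 << bits) - 1
    tmp_a = 0                    # upper half of the running product
    tmp_b = multiplicand & mask  # value that is conditionally added
    tmp_c = multiplier & mask    # multiplier, gradually replaced by low product bits
    carry = 0

    # Prime the carry with the lowest multiplier bit.
    tmp_c, carry = rrcy(tmp_c, carry, bits)

    for _ in range(bits):        # 16 passes for 16-bit words
        if carry:                # the ADD is subject to the carry condition
            total = tmp_a + tmp_b
            tmp_a = total & mask
            carry = total >> bits            # carry out of the addition
        # Shift the double-width product right by one bit.
        tmp_a, carry = rrcy(tmp_a, carry, bits)
        tmp_c, carry = rrcy(tmp_c, carry, bits)  # carry now holds the next multiplier bit

    return (tmp_a << bits) | tmp_c


print(hex(multiply_loop(0xFFFF, 0xFFFF)))  # 0xfffe0001
```

The point of rotating through carry is that one flag does double duty: it carries the low bit of the upper word into the lower word, and it hands the next multiplier bit to the conditional add.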

16 Mar 2023 at 5:27 | Open on oldbytes.space

Ken Shirriff

The left rotate through carry (LRCY) and right rotate through carry (RRCY) are key ALU operations in this process. They are like bit-shifts with more functionality: the bit shifted out of the word goes into the carry flag, while the old carry bit enters the word.
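
A small Python sketch of the two rotates, assuming a 16-bit word and treating the carry flag as a separate 0/1 value. The function names simply mirror the mnemonics; the signatures are made up for the example.

```python
def lrcy(value, carry, bits=16):
    """Left rotate through carry: returns (new_value, new_carry)."""
    mask = (1 << bits) - 1
    new_carry = (value >> (bits - 1)) & 1      # top bit leaves into carry
    new_value = ((value << 1) & mask) | carry  # old carry enters at the bottom
    return new_value, new_carry


def rrcy(value, carry, bits=16):
    """Right rotate through carry: returns (new_value, new_carry)."""
    new_carry = value & 1                             # bottom bit leaves into carry
    new_value = (value >> 1) | (carry << (bits - 1))  # old carry enters at the top
    return new_value, new_carry


print(rrcy(0x0001, 0))  # (0, 1)
print(lrcy(0x8000, 1))  # (1, 1)
```

Because the carry takes part in the rotation, the word and the flag together behave like a 17-bit ring, which is what lets two 16-bit registers be chained into one 32-bit shift.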

16 Mar 2023 at 5:27 | Open on oldbytes.space

Ken Shirriff

Multiplying signed (positive or negative) numbers is more complicated, with more micro-subroutines. This one turns both arguments positive, while tracking the signs in internal flag F1.
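
The sign-handling idea can be sketched in Python as below. This is an illustration under assumptions, not the microcode: the variable f1 stands in for the F1 flag, the function names are invented, and Python's built-in multiply stands in for the unsigned multiplication loop.

```python
def signed_multiply(a, b, bits=16):
    """Multiply two's-complement words; returns a 2*bits-wide two's-complement result."""
    mask = (1 << bits) - 1
    wide_mask = (1 << (2 * bits)) - 1
    sign_bit = 1 << (bits - 1)

    f1 = False                      # stand-in for the internal F1 flag
    if a & sign_bit:                # negative: make it positive, remember the sign
        a = (-a) & mask
        f1 = not f1
    if b & sign_bit:
        b = (-b) & mask
        f1 = not f1

    product = a * b                 # stand-in for the unsigned multiply loop

    if f1:                          # exactly one argument was negative
        product = (-product) & wide_mask
    return product


def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


print(to_signed(signed_multiply(0xFFFD, 0x0005), 32))  # -15
```

Toggling the flag once per negative argument means it ends up set only when the signs differ, which is exactly when the product needs to be negated.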

16 Mar 2023 at 5:29 | Open on oldbytes.space

Ken Shirriff

Multiplication uses a 4-bit hardware loop counter and the special F1 flag. Here's what those features look like on the 8086 die. I removed the metal for this photo to show the silicon and the polysilicon wiring underneath.

16 Mar 2023 at 5:29 | Open on oldbytes.space

Ken Shirriff

Instead of a loop, modern processors use a bunch of adders arranged in a special tree to perform a multiplication in a single clock cycle. The 8086 was very slow in comparison, taking up to 133 clock cycles for a 16-bit multiplication.
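
To give a feel for the tree idea, here is a simplified Python sketch. It forms one partial product per multiplier bit and sums them pairwise in a balanced tree. Real multipliers typically use carry-save adder arrangements rather than the plain adds shown here, so treat this only as a picture of why the depth grows with the logarithm of the word size; the function name and return shape are invented for the example.

```python
def tree_multiply(a, b, bits=16):
    """Sum partial products with a balanced tree of adds.

    Returns (product, levels), where levels is the depth of the tree.
    """
    # One partial product per multiplier bit: either 0 or the shifted multiplicand.
    terms = [(a << i) if (b >> i) & 1 else 0 for i in range(bits)]

    levels = 0
    while len(terms) > 1:
        # In hardware, all adds on one level would happen in parallel.
        paired = [terms[i] + terms[i + 1] for i in range(0, len(terms) - 1, 2)]
        if len(terms) % 2:
            paired.append(terms[-1])   # odd term passes through to the next level
        terms = paired
        levels += 1
    return terms[0], levels


product, levels = tree_multiply(1234, 5678)
print(product == 1234 * 5678, levels)  # True 4
```

Sixteen partial products collapse in four levels of adders, compared with sixteen sequential passes through a loop.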

16 Mar 2023 at 5:29 | Open on oldbytes.space

Ken Shirriff

For more information on multiplication in the 8086 and lots more microcode analysis, see my latest blog post https://www.righto.com/2023/03/8086-multiplication-microcode.html

16 Mar 2023 at 5:30 | Open on oldbytes.space

vruz

@kenshirriff This is a genuinely excellent post. Thanks Ken!

16 Mar 2023 at 5:34 | Open on mastodon.social

[DATA EXPUNGED]

Ken Shirriff

@kentindell Some computers, such as the Xerox Alto, let programmers write in microcode, but there are three problems with this. First, writing in microcode is very difficult because it is extremely low-level. Second, if you change the computer's internal architecture, the microcode changes and you need to rewrite it. Finally, rewritable microcode in RAM can have performance problems.

16 Mar 2023 at 15:59 | Open on oldbytes.space

Janne Moren

@kenshirriff

It strikes me how the microcode really isn't very far from what we would call RISC instructions, in scope and complexity. Or plain old 6502 assembler for that matter.

16 Mar 2023 at 5:30 | Open on fosstodon.org

@kenshirriff Wow, that’s so simple. I always imagined multiplication to be much more complex. Thanks for your write-ups on all these things, they’re absolutely fascinating.

24 Mar 2023 at 20:00 | Open on arvr.social