Thus, a simple add takes 3 micro-instructions (and 3 clock cycles). The microcode is very generic: it doesn't know the particular registers, the ALU operation, or the operand size. It just specifies the steps, and the hardware figures out the details. This keeps the microcode small.
Adding to memory, as in "ADD [SI],AX", uses the same microcode. A microcode subroutine first gets the address from SI and reads (R) the value from memory, so M now represents the memory value. The microcode performs the add as before, then falls through to write (W) the result back to memory because of the writeback (WB) condition.
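
To make the idea of generic microcode concrete, here is a minimal Python sketch, not the real 8086 microcode. It assumes three shared steps that refer only to the placeholders M and R, with the surrounding "hardware" (ordinary Python here) deciding which register, ALU operation, and operand size they stand for. The names `run_alu_microcode`, `ALU_OPS`, and `SHARED_STEPS` are made up for illustration, and the memory read and write are each collapsed into a single step, whereas the real address subroutine takes several micro-instructions.

```python
# Hypothetical model: the ALU operation is picked by the "hardware",
# not by the microcode steps themselves.
ALU_OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "AND": lambda a, b: a & b,
}

# The three generic steps shared by every instruction of this form.
SHARED_STEPS = ["M -> tmpA", "R -> tmpB", "ALU result -> M"]


def run_alu_microcode(regs, memory, op, m_operand, r_operand, width=16):
    """Run the shared steps and return a trace of what was executed.

    m_operand is a register name such as "AX", or "[SI]" for memory
    addressed by SI. r_operand is always a register name.
    """
    mask = (1 << width) - 1
    trace = []
    writeback = m_operand.startswith("[")

    if writeback:
        # Stand-in for the address subroutine plus the memory read.
        addr = regs[m_operand[1:-1]]
        m_value = memory[addr]
        trace.append("read memory")
    else:
        m_value = regs[m_operand]

    # The shared steps: identical for register and memory operands.
    tmp_a = m_value
    tmp_b = regs[r_operand]
    result = ALU_OPS[op](tmp_a, tmp_b) & mask
    trace.extend(SHARED_STEPS)

    if writeback:
        # Writeback condition: fall through to the memory write.
        memory[addr] = result
        trace.append("write memory")
    else:
        regs[m_operand] = result
    return trace


regs = {"AX": 3, "BX": 4, "SI": 0x100}
memory = {0x100: 10}

print(run_alu_microcode(regs, memory, "ADD", "AX", "BX"))    # ADD AX,BX
print(run_alu_microcode(regs, memory, "ADD", "[SI]", "AX"))  # ADD [SI],AX
```

In this sketch the register form runs only the three shared steps, while the memory form runs the same three steps wrapped in a read and a write. Changing `op` or `width` changes the result without touching the step list, which is the sense in which the microcode stays small.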