Faster Multiplication (Cont.)

Using a Parallel Tree
This approach organizes the additions of the 32 partial products in a parallel tree. Rather than using a single 32-bit adder 31 times, this hardware “unrolls the loop” to use 31 adders and then organizes them to minimize delay. Instead of waiting for 32 add times, only log2(32), or five, 32-bit add times are needed.
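The tree organization can be modeled in software to check the counts quoted above. The following is a minimal sketch, not a hardware description: the function names `partial_products` and `tree_multiply` are hypothetical, the operands are assumed to be unsigned 32-bit values, and Python's unbounded integers stand in for the wider adders a real tree would need.

```python
def partial_products(a, b, width=32):
    """One partial product per multiplier bit: a shifted left by i if bit i of b is 1, else 0."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(width)]


def tree_multiply(a, b, width=32):
    """Add the partial products pairwise, level by level.

    Returns (product, levels, adds), where levels is the depth of the tree
    (the number of add times on the critical path) and adds is the total
    number of adders used.
    """
    terms = partial_products(a, b, width)
    levels = 0
    adds = 0
    while len(terms) > 1:
        next_terms = []
        # Every pair at this level could be added at the same time in hardware.
        for i in range(0, len(terms) - 1, 2):
            next_terms.append(terms[i] + terms[i + 1])
            adds += 1
        if len(terms) % 2:
            # An odd term out passes through to the next level unchanged.
            next_terms.append(terms[-1])
        terms = next_terms
        levels += 1
    return terms[0], levels, adds


product, levels, adds = tree_multiply(123456789, 987654321)
print(product == 123456789 * 987654321, levels, adds)
```

With 32 partial products the list shrinks 32, 16, 8, 4, 2, 1, so the model reports five levels and 16 + 8 + 4 + 2 + 1 = 31 additions, matching the 31 adders and five add times described above.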


Carry Save Adders
Multiplication can go even faster than five add times, both because carry save adders can be used and because such a design is easy to pipeline, so that it can support many multiplies simultaneously. An example is given here. It works as follows:
  • All the bits of a carry-save adder work in parallel. The carry does not propagate as in a carry-propagate adder.
  • A carry-save adder has 3 inputs and produces 2 outputs. It adds 3 numbers and produces a word of partial sum bits and a word of carry bits.
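
The two properties listed above can be illustrated with a bitwise model. This is a sketch under stated assumptions: `carry_save_add` and `csa_reduce_sum` are hypothetical names, the inputs are assumed to be non-negative integers, and the reduction loop is written sequentially for clarity, whereas hardware would arrange the carry-save adders in a tree.

```python
def carry_save_add(a, b, c):
    """Add three numbers without propagating carries.

    Each bit position is an independent full adder, so all bits work in
    parallel. Returns (partial_sum, carry) with a + b + c == partial_sum + carry.
    """
    partial_sum = a ^ b ^ c                      # sum bit of each full adder
    carry = ((a & b) | (a & c) | (b & c)) << 1   # carry-out bits, weighted one position higher
    return partial_sum, carry


def csa_reduce_sum(terms):
    """Reduce a list of numbers to two with carry-save adders, then do one ordinary add."""
    terms = list(terms)
    while len(terms) > 2:
        a, b, c = terms.pop(), terms.pop(), terms.pop()
        terms.extend(carry_save_add(a, b, c))    # 3 inputs in, 2 outputs back
    return sum(terms)                            # single carry-propagate add at the end


s, cy = carry_save_add(5, 3, 6)
print(s, cy, s + cy)  # prints "0 14 14"
```

Each carry-save step turns three numbers into two, so only the final addition of the last two values needs a carry to ripple across the word.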


Review: Faster Multiplication
    Which method is NOT used to speed up multiplication?

      Using a parallel tree
      Using carry save adders
      Using forwarding units
      Using multiple adders