[Fpga-synth] Terminology Question

Dave Manley dlmanley at sonic.net
Thu Jun 7 18:07:34 CEST 2007


Scott Gravenhorst wrote:
> 
> 1) Why is this called "pipelining"?  I'm interested in a historical
> perspective if there is one.
> 
> 2) I see WebPACK telling me that the "performance of multiplier XYZ can be
> improved by using pipelining".  However, I don't understand what is meant
> by "improved performance" when if the multipliers are strung together, the
> result can actually be available sooner than if the multipliers are all
> cascaded through pipelining.

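Pipelining is used on combinatorial paths with large delay to cut up the 
delay into a few (or many) shorter delays, with a register between each.  
This allows the overall throughput to increase, even though any single 
result takes at least as long to emerge as it did before.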

Take for example a combinatorial logic function that takes 1 us to 
complete - with the clock at 1 MHz you get 1 result every microsecond.

Now divide the logic function into four stages (this isn't always 
easy!), each of which has a delay of ~250 ns.  Now the clock can run at 
4 MHz and, once the pipeline is full, you get 4 results per microsecond.
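
The arithmetic above can be checked with a small sketch.  The function 
name pipeline_stats is made up for illustration, and it assumes the logic 
splits into perfectly equal stages and that the pipeline registers add no 
delay of their own, neither of which is true in a real design.

```python
def pipeline_stats(total_delay_ns, stages):
    """Return (clock_mhz, results_per_us, latency_ns) for an ideal pipeline.

    Assumes equal-delay stages and zero register overhead (an idealization).
    """
    stage_delay_ns = total_delay_ns / stages
    clock_mhz = 1000.0 / stage_delay_ns    # clock period in ns -> frequency in MHz
    results_per_us = clock_mhz             # one result per clock once the pipe is full
    latency_ns = stage_delay_ns * stages   # time for one input to reach the output
    return clock_mhz, results_per_us, latency_ns


for n in (1, 4):
    mhz, per_us, latency = pipeline_stats(1000.0, n)
    print(f"{n} stage(s): {mhz:g} MHz, {per_us:g} results/us, latency {latency:g} ns")
```

In this idealized model the latency is 1000 ns in both cases; only the 
rate at which results arrive changes.  In real hardware the registers add 
a little delay, so the pipelined latency comes out slightly worse.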

My first exposure to this was in the design of microprogrammed CPUs. 
The path from address fetch, to instruction decode, to ALU operation can 
all run in a single clock tick, but then the clock has to be slow.  
Instead you can put a register after each stage and run the clock much 
faster.

I don't know the origin of the term but would guess it probably started 
in CPU design at one of the big iron mainframe manufacturers like Univac 
or IBM.  Engineers often aren't the best at giving things names so I'm 
not surprised the term is a little ambiguous, but basically the term is 
accurate.  Imagine that you're putting not water into the pipe, but 
discrete things like ping-pong balls. At each tick you push a ball 
into the pipe, and one comes out the other end.  The length of the pipe 
determines the number of balls or equivalently the number of pipe-stages.
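
The ping-pong ball picture can be modeled in a few lines.  This is only a 
toy sketch: run_pipe is a hypothetical name, and None stands for an empty 
slot in the pipe before the first balls have travelled through.

```python
from collections import deque


def run_pipe(inputs, stages):
    """Push one item per tick into a pipe of `stages` slots; collect what comes out."""
    pipe = deque([None] * stages, maxlen=stages)  # None marks an empty slot
    outputs = []
    for item in inputs:
        outputs.append(pipe[-1])  # whatever sits at the far end comes out this tick
        pipe.appendleft(item)     # a new ball goes in; maxlen drops the far-end slot
    return outputs


balls = ["A", "B", "C", "D", "E", "F"]
print(run_pipe(balls, stages=4))
# prints [None, None, None, None, 'A', 'B']
```

Ball A needs four ticks to travel the four-stage pipe, but after that one 
ball comes out on every tick.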

With regard to the Xilinx multipliers, if your design already registers 
the data going into the multiplier, and then registers the result as it 
pops out, then the design is effectively already pipelined - you have 
minimized the amount of combinatorial logic between register stages.  If 
on the other hand you had an adder that fed the input of the multiplier 
and the output of the multiplier fed another adder before the result 
was registered, then you could make the design faster by putting in 
pipeline registers like this: reg->adder->reg->mult->reg->adder->reg. 
The best performance will come from using the dedicated registers that 
are built into the input and output of the multipliers (if they're 
present).
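
To see why the extra registers help, note that the clock period is set by 
the slowest register-to-register path.  The sketch below uses made-up 
delays (3 ns per adder, 6 ns for the multiplier); these are assumptions 
for illustration only, not Xilinx datasheet figures, and register 
overhead is ignored.

```python
def max_clock_mhz(segments):
    """segments: list of register-to-register paths, each a list of block delays in ns."""
    worst_ns = max(sum(path) for path in segments)  # slowest path sets the clock period
    return 1000.0 / worst_ns


ADDER_NS, MULT_NS = 3.0, 6.0  # hypothetical delays, not datasheet values

unpipelined = [[ADDER_NS, MULT_NS, ADDER_NS]]      # reg->adder->mult->adder->reg
pipelined = [[ADDER_NS], [MULT_NS], [ADDER_NS]]    # reg->adder->reg->mult->reg->adder->reg

print(f"unpipelined: {max_clock_mhz(unpipelined):.1f} MHz")  # 83.3 MHz
print(f"pipelined:   {max_clock_mhz(pipelined):.1f} MHz")    # 166.7 MHz
```

With these numbers the multiplier becomes the limiting stage, so cutting 
the adders up any further would not raise the clock rate.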

HTH,
Dave



