[libre-riscv-dev] buffered pipeline
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Thu Mar 14 11:43:40 GMT 2019
there's... something about this that doesn't feel right, which perhaps a
more comprehensive test will pick up.
i *think* it's down to the use of combinatorial logic for the BSY/STB
signals, which in a long pipeline will result in an ever-increasing
propagation delay that will dramatically reduce the maximum clock rate.
as an example, mitch was involved in the AMD K9 architecture, which had a
limit of a mere 16 gates chained together in any given stage.
to explain that further: at the moment it looks really simple (just chain
the BSY/STB lines together), because it's a simple example and no actual
stalling is required (or implemented) in any given stage.
however if say a given stage has particularly complex analysis logic for
whether the stage should stall or not, that complex logic *accumulates* and
propagates up and down the entire pipeline chain.
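
a rough way to see the accumulation is to count gate levels. the sketch
below is only a back-of-envelope model, not the actual pipeline code: the
function names, the "one OR gate per stage" assumption and the per-stage
depth figures are all made up for illustration. it compares a purely
combinatorial BSY chain against one where each stage registers its BSY
output, which cuts the path at every stage (at the cost of needing somewhere
to hold the data already in flight).

```python
def combinatorial_busy_depth(stall_logic_depths):
    """Worst-case gate depth of BSY as seen at the pipeline input when every
    stage ORs its own stall logic with the downstream BSY, with no registers
    in between (assumed model: one OR gate per stage)."""
    depth = 0
    # Walk from the last stage back to the first, as the BSY signal does.
    for own_depth in reversed(stall_logic_depths):
        # busy = own_stall OR downstream_busy: one more gate level on top of
        # whichever of the two inputs settles last.
        depth = max(own_depth, depth) + 1
    return depth


def registered_busy_depth(stall_logic_depths):
    """Worst-case gate depth when each stage registers its BSY output: the
    path is cut at every stage, so only the deepest single stage matters."""
    return max(own_depth + 1 for own_depth in stall_logic_depths)


if __name__ == "__main__":
    GATE_BUDGET = 16  # the per-stage figure quoted above
    for n_stages in (4, 8, 13):
        depths = [4] * n_stages  # hypothetical: 4 gate levels of stall logic
        comb = combinatorial_busy_depth(depths)
        reg = registered_busy_depth(depths)
        print(n_stages, "stages: combinatorial", comb, "registered", reg,
              "over budget" if comb > GATE_BUDGET else "ok")
```

in this toy model the combinatorial chain grows by one gate level per stage,
so it eventually blows any fixed gate budget no matter how simple each
individual stage is, whereas the registered version stays flat.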
this is what dan was talking about in his post
i *believe* you may have implemented the "simple handshake" protocol.
chaining several stages and throwing ten thousand values at the input will
help determine a bit more.
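
as a sketch of what such a test might look like, here is a plain-python
behavioural model, not nmigen and not the real code: `run_pipeline`, the
single-register-per-stage model and the random stall probability are all
assumptions for illustration. it chains several stages using a simple
handshake where busy propagates back through the whole chain within one
cycle, pushes ten thousand values in while the consumer stalls at random, and
checks that nothing is lost, duplicated or reordered.

```python
import random


def run_pipeline(values, n_stages=4, stall_prob=0.3, seed=0):
    """Cycle-by-cycle model of a chain of single-register stages using a
    simple handshake (assumed model, for illustration only)."""
    rng = random.Random(seed)
    slots = [None] * n_stages        # one data register per stage, None = empty
    pending = list(reversed(values))  # values still waiting at the input
    received = []
    cycles = 0
    while pending or any(slot is not None for slot in slots):
        cycles += 1
        # The consumer at the far end randomly refuses data this cycle.
        busy = rng.random() < stall_prob
        # Walk from the output back to the input: each stage's busy depends
        # on the stage after it, i.e. the chain is combinatorial.
        for i in reversed(range(n_stages)):
            if slots[i] is not None and not busy:
                if i == n_stages - 1:
                    received.append(slots[i])
                else:
                    slots[i + 1] = slots[i]
                slots[i] = None
            # A stage is busy to its upstream only if it still holds data.
            busy = slots[i] is not None
        if pending and not busy:
            slots[0] = pending.pop()
    return received, cycles


if __name__ == "__main__":
    values = list(range(10000))
    received, cycles = run_pipeline(values)
    assert received == values, "data lost, duplicated or reordered"
    print("all", len(received), "values arrived in order after", cycles, "cycles")
```

a model like this only checks the data ordering; it says nothing about
propagation delay or glitches, which need the real simulation (or synthesis)
to show up.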
also, see the attached screenshot: there's a spike which has me very
concerned. that really *really* should not be happening, as it will cause
problems: ASICs push the boundary on what can be fitted into a given clock
pulse, and the inputs absolutely have to be stable at the moment the clock
rises!
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 31966 bytes
Desc: not available