[Libre-soc-dev] Reservation Stations. Was [Libre-soc-bugs] [Bug 782] add galois field bitmanip instructions
lkcl
luke.leighton at gmail.com
Tue Mar 8 16:16:25 GMT 2022
here is an explanatory diagram:
https://libre-soc.org/3d_gpu/pipeline_vs_fsms.jpg
on the left, a 10-stage pipeline and (barely enough)
10x ReservationStations.
at the top are 4x Read-Priority-Pickers and 4x Common Data
Buses which, together with the decoded operand information,
bring a whopping but perfectly reasonable and justifiable
THREE HUNDRED data wires into each and every one of the 10
Reservation Stations.
why 300?
because 4x 64-bit registers (RA, RB, RC and an SPR for the
Galois Field reducing polynomial) come to a total of 256
wires (4x64), and it is not unreasonable to have another
45 or so wires from the operand decode phase.
that's 300 wires out of each RS into a 10 (TEN) way
Mux, totaling...
three
THOUSAND
wires
into the single-pipeline mux.
"fortunately" at the other end, it would be of the order of 64x10
"only" 640 wires out, to farm-out the results from the pipeline
back to the corresponding (10) RSes, pending the Write-Priority-Picker
to give each of the (10) RSes one single opportunity to write to the
(one) write-port available on the INT Register File.
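as a quick sanity-check of the figures above, here is a minimal
python sketch of the arithmetic. the constant and function names are
purely illustrative (they are not taken from the Libre-SOC source),
and the 45 decode wires is the approximate "45 or so" figure quoted
above, not a measured one.

```python
# back-of-envelope wire count for 10 RSes sharing one pipeline.
# all figures are the ones quoted in the text; names are illustrative.
REG_WIDTH = 64      # bits per register operand
NUM_OPERANDS = 4    # RA, RB, RC plus one SPR
DECODE_WIRES = 45   # "45 or so" wires from operand decode (approximate)
NUM_RS = 10         # Reservation Stations feeding the single pipeline


def fan_in_per_rs():
    """Wires leaving one Reservation Station towards the pipeline mux."""
    return NUM_OPERANDS * REG_WIDTH + DECODE_WIRES


def fan_in_total():
    """Wires entering the single 10-way mux in front of the pipeline."""
    return fan_in_per_rs() * NUM_RS


def fan_out_total():
    """Result wires farmed back out from the pipeline to all RSes."""
    return REG_WIDTH * NUM_RS


if __name__ == "__main__":
    print("per-RS fan-in :", fan_in_per_rs())
    print("total fan-in  :", fan_in_total())
    print("total fan-out :", fan_out_total())
```

this gives roughly 300 wires per RS, roughly 3,000 into the mux, and
640 coming back out, matching the round numbers used above.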
on the right, a 10-way Finite State Machine, which, obviously,
does not have that completely-mad three THOUSAND wire
Mux.
however to illustrate the point about shared pipelines even in
FSMs, only the first 7 cycles of the FSM produce partial-results,
and the last 3 cycles go into 3-stage "partial-results-processing"
pipelines.
the usual rules apply: if there are 3 stages then there must
be a maximum of 3 FSMs that share that 3-stage pipeline.
if there are more than that, then the FSMs are guaranteed
to stall.
if there are fewer than that (2 FSMs share a 3-stage pipeline)
then the pipeline will only ever be 2/3 occupied (at least one
stage is 100% guaranteed to run empty).
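the balancing rule above can be sketched as follows. this is an
idealised model, not the Libre-SOC implementation: it assumes every
FSM always has work ready and has at most one partial-result in
flight in the shared pipeline at any one time. the function name and
the returned keys are hypothetical.

```python
from fractions import Fraction


def shared_pipeline_balance(num_fsms, num_stages):
    """Idealised balance check for FSMs sharing one pipeline.

    Assumption: each FSM has at most one item in flight, so at most
    min(num_fsms, num_stages) stages can be occupied at once.
    """
    in_flight = min(num_fsms, num_stages)
    return {
        # best-case fraction of pipeline stages doing useful work
        "max_occupancy": Fraction(in_flight, num_stages),
        # stages that can never be filled under this assumption
        "idle_stages": num_stages - in_flight,
        # more FSMs than stages: some FSM has to wait its turn
        "fsms_stall": num_fsms > num_stages,
    }


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, "FSMs on 3 stages:", shared_pipeline_balance(n, 3))
```

with 2 FSMs on 3 stages the model reports 2/3 occupancy and one idle
stage; with 3 it is fully balanced; with 4 it flags stalling.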
it is a delicate balance, a choice between the two.
they are in groups: 2 FSMs in the diagram share a 3-stage
pipeline, meaning there must be QTY 5 of such 3-stage
pipelines.
the fan-in and fan-out are absolutely identical to a ReservationStation2
to the extent that the exact same class - unmodified - may be utilised
here.
the balancing on resources is much better (read: far less completely
insane), in that only having QTY 5 of 2-in 2-out MUXes @ 64-bit
(or more like 192-bit) is infinitely preferable to an absolutely-mental
10-way MUX on 300+ wires.
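to put rough numbers on that comparison, here is a small python
sketch. it only counts mux input wires, takes the 192-bit figure
above at face value, and uses 301 (4x64 plus about 45 decode wires)
for the "300+" width. all names are illustrative and the result is a
back-of-envelope estimate, not a synthesis result.

```python
# per-mux input wire count: 5 small muxes versus 1 large one.
# figures are those quoted in the text; names are illustrative only.
NUM_FSMS = 10
FSMS_PER_PIPELINE = 2      # 2 FSMs share each 3-stage pipeline
SMALL_MUX_WIDTH = 192      # bits, "more like 192-bit"
BIG_MUX_WAYS = 10          # 10 RSes into the single pipeline
BIG_MUX_WIDTH = 301        # 4x64 + ~45 decode wires (approximate)


def num_shared_pipelines():
    """How many 3-stage pipelines (and small muxes) are needed."""
    return NUM_FSMS // FSMS_PER_PIPELINE


def small_mux_inputs():
    """Input wires into ONE of the 2-in muxes."""
    return FSMS_PER_PIPELINE * SMALL_MUX_WIDTH


def big_mux_inputs():
    """Input wires into the single 10-way mux."""
    return BIG_MUX_WAYS * BIG_MUX_WIDTH


if __name__ == "__main__":
    print("pipelines/muxes needed :", num_shared_pipelines())
    print("wires into each 2-in   :", small_mux_inputs())
    print("wires into the 10-way  :", big_mux_inputs())
```

under these assumptions each of the 5 small muxes sees 384 input
wires, against about 3,000 converging on the single 10-way mux.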
i believe this illustrates the point that pipelines are not, under 100%
of all circumstances, "better" than FSMs. Jacob, your thoughts?
l.