[Libre-soc-bugs] [Bug 413] DIV "trial" blocks are too large

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Fri Jul 3 16:37:20 BST 2020


https://bugs.libre-soc.org/show_bug.cgi?id=413

--- Comment #15 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
i think i have a solution, it is simpler than previously imagined.

remember:

there are multiple ReservationStations but only one pipeline.  there is a
Priority muxer that selects one and only one RS INPUT, allowing it to send its
data into the front of the pipeline.

the muxid NORMALLY remains the same throughout the entire pipeline so that at
the output stage the result may be dropped into the correct RS Output latch.
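as a rough illustration of the arrangement just described, here is a minimal
software sketch (not the actual hardware).  the names priority_select and send
are hypothetical, and "lowest index wins" is an assumption: the priority order
is not stated here.

```python
def priority_select(ready):
    """Return the index of the one RS input granted access, or None.

    Assumption: the lowest-numbered ready RS wins.
    """
    for idx, is_ready in enumerate(ready):
        if is_ready:
            return idx
    return None


def send(rs_inputs, ready):
    """Tag the selected RS's data with its muxid before it enters the pipeline.

    The muxid travels with the data so that the output stage knows which
    RS Output latch to drop the result into.
    """
    muxid = priority_select(ready)
    if muxid is None:
        return None  # nothing to send this cycle
    return {"muxid": muxid, "data": rs_inputs[muxid]}
```

only one RS is ever selected per call, which mirrors the "one and only one RS
INPUT" rule.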

so the modifications are as follows:

* 4x the number of RSes are created.  if 8 initially, 32 are created.

* RS output 8 thru 31 are wired *directly* to RS input 8 thru 31.  8 to 8 etc.

* one of the pipeline stages *MODIFIES* ctx.mid (the RS muxid) *DIRECTLY* (by
adding or subtracting multiples of 8 to it). when (or if) this is done does not
matter.

* for 8/16/32 bit operations no modification to muxid is needed.

* for 64 bit operations on the first run through, 8 is added to the muxid. 
this causes the result to be fed into RS Output 8-15 instead of 0-7.  also the
operation is marked "Phase 2".

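* likewise on the 2nd and 3rd runs: 8 is added again each time, so the result
goes to RS Output 16-23 and then 24-31, and the phase marker advances.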

* for the final phase, 24 is *SUBTRACTED* from the muxid by the last pipeline
stage.

this last action will cause the data, now having been through the pipeline 4
times, to drop into RS 0-7 which is the output to the CompUnits 0-7.
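the muxid arithmetic above can be checked with a small software model.  this is
only a sketch under the numbers used in this comment (8 CompUnit-facing RSes, 4
passes for a 64 bit operation); next_muxid, trace, N_CU and N_PHASES are
made-up names, not existing code.

```python
N_CU = 8      # RSes facing the CompUnits (0-7), as in the example above
N_PHASES = 4  # passes through the pipeline for a 64 bit operation


def next_muxid(muxid, phase, is_64bit):
    """Return (new_muxid, new_phase, done) after one pass through the pipeline."""
    if not is_64bit:
        return muxid, phase, True  # 8/16/32 bit: muxid is left alone
    if phase < N_PHASES:
        # not the final phase: add 8 so the result lands in a feedback RS
        return muxid + N_CU, phase + 1, False
    # final phase: subtract 24 so the result lands back in RS 0-7
    return muxid - (N_PHASES - 1) * N_CU, phase, True


def trace(muxid, is_64bit):
    """List the RS Output index the result drops into after each pass."""
    phase, done, seen = 1, False, []
    while not done:
        muxid, phase, done = next_muxid(muxid, phase, is_64bit)
        seen.append(muxid)
    return seen


print(trace(3, True))   # [11, 19, 27, 3]
print(trace(3, False))  # [3]
```

a 64 bit operation entering on RS 3 visits RS Output 11, 19 and 27 (each wired
straight back to the matching RS input) before finishing on RS Output 3.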

so the only modifications needed to DIV are to be able to specify a range of
shift amounts rather than one static one.

each time through the pipelines will select one of those shift amounts.

this is where it gets slightly complex, because the last thing we want is a
full dynamic shifter.

instead, what might be possible is to use a shift cascade, by analysing and
subtracting the amounts between the static shifts and activating them
combinatorially depending on the loop number:

shift1 = 5 # constant
shift2 = 20 # constant

therefore:

if pipelooprun == 1 or pipelooprun == 2:
    out1 = inp << 5
if pipelooprun == 2:
    out2 = out1 << 15 # 20-5

etc.

this assumes that the shift amounts are incrementing on each loop.  given that
each stage is just wires (muxes), it should not cause too much gate delay.
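the cascade idea generalises to any list of incrementing static shifts.  below
is a minimal software model of it, not HDL: shift_cascade is a hypothetical
name, the default amounts (5, 20) are the example constants from above, and
python integers are unbounded whereas real hardware would have a fixed width.

```python
def shift_cascade(value, pipelooprun, shifts=(5, 20)):
    """Shift `value` by shifts[pipelooprun - 1] using only fixed-amount shifts.

    Stage k shifts by the difference between shifts[k] and shifts[k-1] and is
    enabled when the (1-based) loop number is at least k, so no dynamic
    shifter is needed.
    """
    if list(shifts) != sorted(shifts):
        raise ValueError("shift amounts must be incrementing")
    if not 1 <= pipelooprun <= len(shifts):
        raise ValueError("pipelooprun out of range")
    out = value
    prev = 0
    for stage, amount in enumerate(shifts, start=1):
        if pipelooprun >= stage:   # the mux: take the shifted or unshifted path
            out <<= amount - prev  # constant shift, pure wiring in hardware
        prev = amount
    return out


print(shift_cascade(1, 1))  # 32, i.e. 1 << 5
print(shift_cascade(1, 2))  # 1048576, i.e. 1 << 20
```

each extra loop adds one more mux stage in series, which is where the gate
delay concern comes from.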

however if we try to do radix greater than 8 it may be a problem
