[Libre-soc-isa] [Bug 697] SVP64 Reduce Modes

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Wed Feb 2 12:01:04 GMT 2022


https://bugs.libre-soc.org/show_bug.cgi?id=697

--- Comment #10 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
def reduce(vl, vec, pred):
    vi = []  # array of lookup indices to skip nonpredicated
    for i, pbit in enumerate(pred):
        if pbit:
            vi.append(i)
    step = 2
    while step <= vl:
        halfstep = step // 2
        for i in range(0, vl, step):
            if i + halfstep >= len(vi):
                break  # no predicated partner left at this step
            other = vi[i + halfstep]
            i = vi[i]
            other_pred = other < vl and pred[other]
            if pred[i] and other_pred:
                vec[i] += vec[other]
            pred[i] |= other_pred
        step *= 2

would turn into something like this:

def i_yielder(vl, pred):
    vi = []  # array of lookup indices to skip nonpredicated
    for i, pbit in enumerate(pred):
        if pbit:
            vi.append(i)
    step = 2
    while step <= vl:
        halfstep = step // 2
        for i in range(0, vl, step):
            if i + halfstep >= len(vi):
                break  # no predicated partner left at this step
            other = vi[i + halfstep]
            i = vi[i]
            other_pred = other < vl and pred[other]
            if pred[i] and other_pred:
                yield i
            pred[i] |= other_pred
        step *= 2

def other_yielder(vl, pred):
    vi = []  # array of lookup indices to skip nonpredicated
    for i, pbit in enumerate(pred):
        if pbit:
            vi.append(i)
    step = 2
    while step <= vl:
        halfstep = step // 2
        for i in range(0, vl, step):
            if i + halfstep >= len(vi):
                break  # no predicated partner left at this step
            other = vi[i + halfstep]
            i = vi[i]
            other_pred = other < vl and pred[other]
            if pred[i] and other_pred:
                yield other
            pred[i] |= other_pred
        step *= 2

and now that fits the SVREMAP pattern:

for i in range(some_function_of(VL)):
     RA_offs = yield reduction_yielder_for_RA() # other
     RB_offs = yield reduction_yielder_for_RB() # i
     RT_offs = yield reduction_yielder_for_RT() # also i

     regs[RT+RT_offs] = regs[RA+RA_offs] + regs[RB+RB_offs]

except for the bit about pred[i] |= other_pred which i will leave you
to work out.
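
as a rough illustration only, the pattern can be sketched as one
generator that hands out (i, other) pairs, driving a register add.
the names reduction_schedule and svremap_reduce_add are made up for
this sketch. it also assumes the loop runs until halfstep reaches the
number of predicated elements (so that non-power-of-two counts reduce
fully), and it works on the compacted index list only, so it never
needs to update pred. whether that holds for the real design is not
settled here.

```python
def reduction_schedule(vl, pred):
    """Yield (i, other) element offsets for a tree reduction.

    Hypothetical helper: only predicated elements take part, and the
    caller's predicate is read but never modified.
    """
    # lookup indices of the active elements, skipping nonpredicated ones
    vi = [i for i, pbit in enumerate(pred[:vl]) if pbit]
    halfstep = 1
    while halfstep < len(vi):
        step = halfstep * 2
        # k + halfstep always stays inside vi because of the range bound
        for k in range(0, len(vi) - halfstep, step):
            yield vi[k], vi[k + halfstep]
        halfstep = step


def svremap_reduce_add(regs, RT, RA, RB, vl, pred):
    """Apply the schedule in the SVREMAP style: RA gets 'other', RB and RT get 'i'."""
    for rb_offs, ra_offs in reduction_schedule(vl, pred):
        rt_offs = rb_offs  # RT uses the same offset as RB
        regs[RT + rt_offs] = regs[RA + ra_offs] + regs[RB + rb_offs]
    return regs


# usage: vector of 5 elements held in regs[8..12], all predicated
regs = [0] * 8 + [1, 2, 3, 4, 5]
svremap_reduce_add(regs, RT=8, RA=8, RB=8, vl=5, pred=[1, 1, 1, 1, 1])
# regs[8] now holds 15, the sum of all five elements
```

the result lands in the first predicated element, which matches the
in-place vec[i] += vec[other] form further up.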

bear in mind that it's fine to be [much] slower when restarting
after a context-switch in the middle of the operation. so if pred[i]
has to be stored in throw-away state which is re-computed, that's
perfectly fine.
[whereas overloading some arbitrary register or adding yet another
 SPR to store it is not fine]
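
a minimal sketch of what "re-computed throw-away state" could look
like, assuming the only thing saved across the context-switch is a
count of completed steps (the names schedule and run are hypothetical,
and the schedule is a simplified one over the predicated elements).
on restart the schedule is simply regenerated from vl and pred and the
already-completed entries are skipped, which is slower but needs no
extra architectural state:

```python
from itertools import islice


def schedule(vl, pred):
    """Hypothetical tree-reduction schedule over the predicated elements."""
    vi = [i for i, p in enumerate(pred[:vl]) if p]
    halfstep = 1
    while halfstep < len(vi):
        for k in range(0, len(vi) - halfstep, halfstep * 2):
            yield vi[k], vi[k + halfstep]
        halfstep *= 2


def run(vec, vl, pred, start=0, stop=None):
    """Apply schedule entries [start, stop) to vec.

    Returns the number of schedule steps completed so far, which is the
    only value that would need saving across an interrupt.
    """
    done = start
    # islice re-runs the generator from scratch and discards the first
    # 'start' entries: this is the slow recomputation on restart
    for i, other in islice(schedule(vl, pred), start, stop):
        vec[i] += vec[other]
        done += 1
    return done


# uninterrupted run
a = [1, 2, 3, 4, 5, 6, 7]
run(a, 7, [1] * 7)

# same reduction, "interrupted" after 3 steps and then resumed
b = [1, 2, 3, 4, 5, 6, 7]
saved = run(b, 7, [1] * 7, stop=3)
run(b, 7, [1] * 7, start=saved)
assert a == b
```

this only works if pred is unchanged between the interrupt and the
restart, since the whole schedule is derived from it.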

(matrix-multiply REMAP has to take a long time to work out
 where it was interrupted, for non-power-two, because you
 can't exactly do a cascading set of MOD/DIV operations in
 one clock cycle)
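
to illustrate the cascading MOD/DIV point, here is a sketch assuming
a plain three-level nested loop with the innermost dimension varying
fastest, and a single flat step counter saved at the interrupt. the
function name, the dimension names and the loop ordering are all
assumptions for illustration, not the actual matrix REMAP layout:

```python
def indices_from_step(step, xdim, ydim, zdim):
    """Recover (i, j, k) loop indices from a flat step counter.

    Each divmod depends on the quotient of the previous one, which is
    why the chain cannot trivially be collapsed into a single stage.
    """
    step, k = divmod(step, zdim)  # innermost index
    step, j = divmod(step, ydim)  # middle index
    i = step % xdim               # outermost index
    return i, j, k


# check against the nested loops the counter is assumed to come from
n = 0
for i in range(3):
    for j in range(5):
        for k in range(7):
            assert indices_from_step(n, 3, 5, 7) == (i, j, k)
            n += 1
```

with power-of-two dimensions each divmod reduces to a shift and a
mask, which is why only the non-power-of-two case is slow.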

also bear in mind that in Rc=1 mode if the predicate being
relied on by the algorithm is overwritten by Rc=1 then you may
end up with corrupted results (even without an interrupt in the
middle)
