[Libre-soc-isa] [Bug 697] SVP64 Reduce Modes
bugzilla-daemon at libre-soc.org
Wed Feb 2 12:01:04 GMT 2022
https://bugs.libre-soc.org/show_bug.cgi?id=697
--- Comment #10 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
def reduce(vl, vec, pred):
    vi = [] # array of lookup indices to skip nonpredicated
    for i, pbit in enumerate(pred):
        if pbit:
            vi.append(i)
    step = 2
    while step // 2 < vl: # halfstep < vl, so non-power-of-two vl is covered
        halfstep = step // 2
        for i in range(0, vl, step):
            if i + halfstep >= len(vi):
                continue # no partner left in the compacted list
            other = vi[i + halfstep]
            i = vi[i]
            other_pred = other < vl and pred[other]
            if pred[i] and other_pred:
                vec[i] += vec[other]
            pred[i] |= other_pred
        step *= 2
would turn into something like this:
def i_yielder(....):
    vi = [] # array of lookup indices to skip nonpredicated
    for i, pbit in enumerate(pred):
        if pbit:
            vi.append(i)
    step = 2
    while step // 2 < vl: # halfstep < vl, so non-power-of-two vl is covered
        halfstep = step // 2
        for i in range(0, vl, step):
            if i + halfstep >= len(vi):
                continue # no partner left in the compacted list
            other = vi[i + halfstep]
            i = vi[i]
            other_pred = other < vl and pred[other]
            if pred[i] and other_pred:
                yield i
            pred[i] |= other_pred
        step *= 2
def other_yielder(....):
    vi = [] # array of lookup indices to skip nonpredicated
    for i, pbit in enumerate(pred):
        if pbit:
            vi.append(i)
    step = 2
    while step // 2 < vl: # halfstep < vl, so non-power-of-two vl is covered
        halfstep = step // 2
        for i in range(0, vl, step):
            if i + halfstep >= len(vi):
                continue # no partner left in the compacted list
            other = vi[i + halfstep]
            i = vi[i]
            other_pred = other < vl and pred[other]
            if pred[i] and other_pred:
                yield other
            pred[i] |= other_pred
        step *= 2
and now that fits the SVREMAP pattern
RA_offsets = reduction_yielder_for_RA() # other
RB_offsets = reduction_yielder_for_RB() # i
RT_offsets = reduction_yielder_for_RT() # also i
for _ in range(some_function_of(VL)):
    RA_offs = next(RA_offsets)
    RB_offs = next(RB_offsets)
    RT_offs = next(RT_offsets)
    regs[RT+RT_offs] = regs[RA+RA_offs] + regs[RB+RB_offs]
except for the bit about pred[i] |= other_pred which i will leave you
to work out.
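
As a rough illustration of the pattern above, here is a minimal self-contained sketch in Python. The names reduction_schedule and remap_style_reduce are hypothetical, as is the choice to fuse the two yielders into one generator that produces (i, other) pairs. Because vi only ever holds indices whose predicate bit is set, the per-step predicate tests are always true here and are dropped, so this sketch sidesteps the pred[i] |= other_pred question rather than answering it.

```python
def reduction_schedule(vl, pred):
    """Yield (i, other) element offsets for a predicated tree reduction.

    Hypothetical helper: fuses i_yielder and other_yielder into one generator.
    """
    # Indices of the active (predicated-in) elements, in order.
    vi = [i for i, pbit in enumerate(pred[:vl]) if pbit]
    step = 2
    while step // 2 < len(vi):
        halfstep = step // 2
        for pos in range(0, len(vi), step):
            if pos + halfstep >= len(vi):
                continue  # no partner at this level of the tree
            yield vi[pos], vi[pos + halfstep]
        step *= 2


def remap_style_reduce(regs, RT, RA, RB, vl, pred):
    """Issue one scalar add per schedule entry, REMAP-fashion."""
    for i, other in reduction_schedule(vl, pred):
        RA_offs = other  # source that gets folded in
        RB_offs = i      # accumulator, read
        RT_offs = i      # accumulator, written
        regs[RT + RT_offs] = regs[RA + RA_offs] + regs[RB + RB_offs]


# A 5-element vector starting at register 8, with element 1 masked out.
regs = [0] * 8 + [1, 2, 3, 4, 5]
remap_style_reduce(regs, RT=8, RA=8, RB=8, vl=5, pred=[1, 0, 1, 1, 1])
print(regs[8])  # sum of the elements whose predicate bit is set
```

The result ends up in the first active element, and the number of adds is one less than the number of active elements.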
bear in mind that it's fine to be [much] slower restarting in the
middle of a context-switch. so if the pred[i] has to be stored in
throw-away state which is re-computed, that's perfectly fine.
[where overloading some arbitrary register or adding yet another
SPR to store it is not fine]
(matrix-multiply REMAP has to take a long time to work out
where it was interrupted, for non-power-two, because you
can't exactly do a cascading set of MOD/DIV operations in
one clock cycle)
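
The "recompute throw-away state on restart" idea can be sketched as follows. This is an assumption-laden toy, not the actual hardware or simulator behaviour: the only state kept across the interruption is a step counter (the name run_steps and its parameters are hypothetical), and on resume the schedule is regenerated from the start with the already-completed entries discarded. It assumes the predicate is unchanged between interruption and resume.

```python
from itertools import islice


def reduction_schedule(vl, pred):
    """Yield (i, other) element offsets for a predicated tree reduction."""
    vi = [i for i, pbit in enumerate(pred[:vl]) if pbit]
    step = 2
    while step // 2 < len(vi):
        halfstep = step // 2
        for pos in range(0, len(vi), step):
            if pos + halfstep >= len(vi):
                continue  # no partner at this level of the tree
            yield vi[pos], vi[pos + halfstep]
        step *= 2


def run_steps(vec, vl, pred, start, count=None):
    """Run schedule entries start .. start+count (all remaining if count is None).

    The schedule is rebuilt from scratch and the first `start` entries are
    thrown away, so restarting costs time proportional to the work already
    done, but needs no saved state beyond the step counter that is returned.
    """
    stop = None if count is None else start + count
    done = start
    for i, other in islice(reduction_schedule(vl, pred), start, stop):
        vec[i] += vec[other]
        done += 1
    return done


vec = [1, 2, 3, 4, 5]
pred = [1, 0, 1, 1, 1]
step = run_steps(vec, 5, pred, start=0, count=2)  # "interrupted" after two adds
step = run_steps(vec, 5, pred, start=step)        # resume: replay, skip, finish
print(step, vec[0])
```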
also bear in mind that in Rc=1 mode if the predicate being
relied on by the algorithm is overwritten by Rc=1 then you may
end up with corrupted results (even without an interrupt in the
middle)
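
A toy model of that hazard, for illustration only: it does not implement real Rc=1 or CR-field semantics. It simply assumes (hypothetically) that a "result > 0" test bit is written back into the same storage the reduction is using as its predicate, and shows that a later level of the tree then skips an element it should have folded in.

```python
def tree_reduce(vec, pred, overwrite_pred=False):
    """Predicated tree reduction in place; returns the first active element.

    overwrite_pred=True is a toy stand-in for a result-test bit landing in
    the predicate's storage after every add.
    """
    vi = [i for i, pbit in enumerate(pred) if pbit]
    step = 2
    while step // 2 < len(vi):
        halfstep = step // 2
        for pos in range(0, len(vi), step):
            if pos + halfstep >= len(vi):
                continue  # no partner at this level of the tree
            i, other = vi[pos], vi[pos + halfstep]
            if pred[i] and pred[other]:  # consults the live predicate
                vec[i] += vec[other]
                if overwrite_pred:
                    pred[i] = vec[i] > 0  # clobbers the bit relied on later
        step *= 2
    return vec[vi[0]] if vi else None


safe = tree_reduce([-5, 2, 3, 4], [1, 1, 1, 1])
clobbered = tree_reduce([-5, 2, 3, 4], [1, 1, 1, 1], overwrite_pred=True)
print(safe, clobbered)
```

With these inputs the first partial sum is negative, so the overwritten predicate bit for element 0 is cleared and the final level of the tree never adds the other half in. No interrupt is involved.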