[Libre-soc-isa] [Bug 697] SVP64 Reduce Modes

Wed Feb 2 13:01:30 GMT 2022

https://bugs.libre-soc.org/show_bug.cgi?id=697

--- Comment #12 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #11)
> the algorithm we'd want to use is based on the algorithm without indirection
> through the vi table:
> def reduce(vl, vec, pred):
>     step = 1
>     while step < vl:
>         step *= 2
>         for i in range(0, vl, step):
>             other = i + step // 2
>             # the partner only counts if it is in range and unmasked
>             other_pred = other < vl and pred[other]
>             if pred[i] and other_pred:
>                 vec[i] += vec[other]
>             elif other_pred:
>                 vec[i] = vec[other]
>             pred[i] |= other_pred
>     # scalar result is now in vec[0]
> 
> idk why the version with vi was added to the wiki, 

because it is not possible to have two completely separate
types of operations (one a MV, the other arithmetic), and
because internal state is not permitted (or practical)
except that which will fit into SVREMAP SPRs.

if it is wrong, and you want this in SVP64, it is your responsibility
to work through the full implementation details.

> but it destroys most of
> the benefits of the above version. the above version specifically reduces in
> a tree pattern where changes in vl/pred don't affect where in the tree
> add/copy are performed, allowing a cpu to efficiently implement tree
> reduction without needing a full crossbar on the alu inputs or outputs. when
> some part of pred is 0, it changes the adds to copies, but doesn't change
> the position or order of any other operations.

"changing adds to MVs" is not permitted.  changing the input parameter
on an add so that it is *equivalent* to a MV is fine.

however, requiring the decoder to drop 1 (or 1.0) into the appropriate
operation, to make any given operation a MV, is pretty dicey, especially
when it comes to reduce on CR operations.
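
to illustrate the "equivalent to a MV" idea, here is a minimal sketch. it
is not SVP64 behaviour: the `IDENTITY` table and `op_or_copy` are
hypothetical names. it substitutes the operation's identity element for a
masked-out destination, so the same operation ends up acting as a copy. the
identity differs per operation (0 for add, 1 for multiply, all-ones for
bitwise and), which hints at how much per-operation knowledge would be needed.

```python
import operator

# Hypothetical table: the identity operand for each operation.
IDENTITY = {
    operator.add: 0,
    operator.mul: 1,
    operator.or_: 0,
    operator.and_: -1,  # all bits set
}


def op_or_copy(op, dest_val, src_val, dest_live):
    """Apply `op`; if the destination is masked out, feed in the identity
    so that the result is simply `src_val` (i.e. the op behaves as a MV)."""
    lhs = dest_val if dest_live else IDENTITY[op]
    return op(lhs, src_val)
```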

therefore, on balance, it's not happening (too complex)

an algorithm which tracks where the operand is and therefore *does not
need the MV at all* would on the other hand be perfectly fine, as long as,
again, it fits into SVREMAP.

(yes, really, the DCT REMAP does some incredibly strange-looking non-linear
indices remapping, don't ask me to explain it here, it took me 6+ weeks to work
out)

so, please can you rewrite the algorithm in a form that uses yield iterators
and does not require a MV? yes, this means having a special index array that
allows de-referencing of non-masked elements, and i suspect that will be
perfectly fine.
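
as a rough illustration of that request (a sketch under my own assumptions,
not the algorithm that ended up in the wiki or in SVREMAP), the quoted tree
reduction can be turned into a generator that yields only (dest, src) pairs
for real arithmetic operations. where the quoted version would copy, this
one redirects an entry in an index array instead. the names
`reduce_schedule`, `reduce_no_mv` and `idx` are hypothetical.

```python
def reduce_schedule(vl, pred, idx):
    """Yield (dest, src) index pairs for a predicated tree reduction.

    Hypothetical sketch: `idx` is an illustrative index array, not an
    SVP64-defined structure. idx[i] names the element that currently holds
    the partial result for tree slot i. The caller passes
    idx = list(range(vl)); afterwards idx[0] locates the final result.
    """
    live = list(pred[:vl])  # working copy, the caller's predicate is untouched
    step = 1
    while step < vl:
        step *= 2
        for i in range(0, vl, step):
            other = i + step // 2
            other_live = other < vl and live[other]
            if live[i] and other_live:
                yield idx[i], idx[other]  # one arithmetic operation
            elif other_live:
                idx[i] = idx[other]  # redirect the slot, no MV issued
            live[i] = live[i] or other_live


def reduce_no_mv(vl, vec, pred):
    """Run the schedule with addition; return (result, index) or (None, None)."""
    idx = list(range(vl))
    for dest, src in reduce_schedule(vl, pred, idx):
        vec[dest] += vec[src]
    if not any(pred[:vl]):
        return None, None
    return vec[idx[0]], idx[0]
```

for vec = [1, 2, 3, 4, 5] with only elements 1, 2 and 4 enabled, the schedule
is (1, 2) then (1, 4), and the sum 10 ends up in vec[1] rather than vec[0].
that is the trade-off in this sketch: the tree positions stay fixed and no MV
is needed, but the result location depends on the predicate.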

will post separately with the basic principle
