[Libre-soc-dev] parallel reduction

lkcl luke.leighton at gmail.com
Wed Sep 7 18:53:52 BST 2022

predication on REMAP schedules is much more complex
than it first appears.

it is not just parallel-prefix, it is the fact that *all* REMAPs
are operation-based but that the predicate masks are only
useful as bit-lookups *after* the remapped index is calculated:

predicatebit = mask[remap(srcstep)]
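
as a minimal sketch of that lookup order (python, purely illustrative:
the names predicate_bit, reversed_remap and run are hypothetical, and
the reversing schedule is just a stand-in, not one of the real REMAP
modes), the index is remapped first and only then used to pick a bit
out of the mask:

```python
def predicate_bit(mask, remap, srcstep):
    """Return the predicate bit for one step: remap first, then look up."""
    idx = remap(srcstep)          # remapped element index
    return (mask >> idx) & 1      # mask treated as an integer bitfield


def reversed_remap(vl):
    """Hypothetical schedule (assumption): walk the elements in reverse."""
    return lambda step: vl - 1 - step


def run(mask, remap, vl):
    """List the remapped indices whose predicate bit is set, in step order."""
    return [remap(s) for s in range(vl) if predicate_bit(mask, remap, s)]
```

the point of the sketch is that the mask cannot be consulted in step
order: bit N of the mask belongs to remapped element N, whichever step
happens to produce it.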

even more complex is that matrix multiply needs *three*
separate and distinct predicates! one for xdim, one for ydim,
one for zdim.
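
to illustrate why three masks are involved, here is a rough python
sketch of a matrix multiply with one mask per dimension. everything
in it is an assumption made for illustration: the function name
masked_matmul, the choice of which loop is x, y or z, and in
particular the rule that a multiply-accumulate only runs when all
three bits are set. it is not a description of how SVP64 does or
will do it.

```python
def masked_matmul(a, b, xmask, ymask, zmask):
    """Multiply a (ydim x zdim) by b (zdim x xdim) under three bit-masks."""
    ydim, zdim, xdim = len(a), len(b), len(b[0])
    result = [[0] * xdim for _ in range(ydim)]
    for y in range(ydim):
        for x in range(xdim):
            for z in range(zdim):
                # assumption: the operation runs only if the bit for each
                # of the three dimension indices is set in its own mask
                if (ymask >> y) & 1 and (xmask >> x) & 1 and (zmask >> z) & 1:
                    result[y][x] += a[y][z] * b[z][x]
    return result
```

a single flat mask indexed by the step counter cannot express "skip
row 0" or "skip inner term 1" directly, which is what the per-dimension
masks in the sketch do.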

my feeling is therefore that the parallel reduction issue should
be closed as completed, with a lot of thought put into REMAP
predication after mid-october.

altering the current predication system is off the table: it has
been a lot of work, and there is a case for keeping it, as it controls
individual operations, which is useful for remote deterministic
processing (Snitch, ETH Zurich).

