[Libre-soc-dev] [RFC] merging parallel reduction into REMAP

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sun Aug 1 10:59:37 BST 2021


https://libre-soc.org/openpower/sv/svp64/appendix/?updated#index14h1

i'm looking at the parallel reduction algorithm and note that it is
remarkably similar to the REMAP schedule for DCT COS table generation.

   8 4 2 1

which is exactly the kind of thing i was looking for, to make general
abstractions.

the first issue is, however, that it is not ok to have two separate
and distinct operations.

the parallel reduction pseudocode has two operations:

1) the operation requested
2) a MV operation

the MV has to go.

a trick i have been using in the simulator "yield" iterators is to
create redirection lookup indices.  i am reasonably confident that
these can be blatted down to O(1) at gate level, however they give an
idea:

instead of MVing the data, use the predicate bits to sequentially
"step over" the data:

lookup = []  # lookup[j] is the register index of the j-th active element
for i, pbit in enumerate(predicate_bits):
  if pbit == 1:
    lookup.append(i)

then use lookup[index] for all register accesses.
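
to make the idea concrete, here is a minimal sketch of a reduction that goes through the lookup table instead of MVing data. it is an illustration only, not the pseudocode from the appendix: the names build_lookup and predicated_tree_reduce are made up for this example, and the pairing order (steps 1, 2, 4, ...) is an assumption rather than the actual schedule. masked-out elements are never read or written, and the result ends up in the first active element.

```python
def build_lookup(predicate_bits):
    """Return the indices of elements whose predicate bit is set."""
    return [i for i, pbit in enumerate(predicate_bits) if pbit == 1]


def predicated_tree_reduce(regs, predicate_bits, op):
    """Tree-reduce the active elements of regs in place, with no MV step.

    Hypothetical helper: every register access goes through lookup[],
    so masked-out elements are "stepped over" rather than moved.
    """
    lookup = build_lookup(predicate_bits)
    n = len(lookup)
    step = 1
    while step < n:
        # pair up active elements that are `step` apart in lookup order
        for k in range(0, n - step, step * 2):
            dst = lookup[k]
            src = lookup[k + step]
            regs[dst] = op(regs[dst], regs[src])
        step *= 2
    # result lives in the first active element (None if nothing is active)
    return regs[lookup[0]] if n else None


regs = [10, 20, 30, 40, 50, 60, 70, 80]
predicate_bits = [1, 0, 1, 1, 0, 1, 0, 1]
total = predicated_tree_reduce(regs, predicate_bits, lambda a, b: a + b)
```

here only elements 0, 2, 3, 5 and 7 take part, and only the one requested operation is ever applied.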

i will update the pseudocode with this idea, to see what it looks like.

l.


