[Libre-soc-bugs] [Bug 1155] O(n^2) multiplication REMAP mode(s)

bugzilla-daemon at libre-soc.org
Fri Dec 22 13:30:55 GMT 2023


https://bugs.libre-soc.org/show_bug.cgi?id=1155

--- Comment #15 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
ok so this is the primary "yielder" function. fascinatingly,
the first two outputs (ai, bi) are covered by the standard
matrix and svindex REMAPs. the other two are covered by exactly
the middle of the three patterns from comment #0, just offset
by one in the case of the 4th.

def python_mul_remap_yielder(a_sz, b_sz):
    for ai in range(a_sz):
        for bi in range(b_sz):
            yield ai, bi, ai + bi, ai + bi + 1
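
to see what the four indices are for, here is a minimal sketch of a
schoolbook (O(n^2)) big-integer multiply driven by the yielder. the
names bigmul and digits_to_int, the 8-bit digit size and the
little-endian digit order are all assumptions made for illustration,
and the interpretation of the 3rd and 4th indices as "low half" and
"high half" destinations is inferred from the bug title rather than
stated in this comment. carries are deferred to a final pass, which
relies on python's unbounded integers and is not how hardware would
do it.

```python
def python_mul_remap_yielder(a_sz, b_sz):
    for ai in range(a_sz):
        for bi in range(b_sz):
            yield ai, bi, ai + bi, ai + bi + 1


def bigmul(a, b, bits=8):
    """Multiply two little-endian digit lists (hypothetical helper)."""
    mask = (1 << bits) - 1
    result = [0] * (len(a) + len(b))
    for ai, bi, lo_idx, hi_idx in python_mul_remap_yielder(len(a), len(b)):
        prod = a[ai] * b[bi]
        result[lo_idx] += prod & mask    # low half goes to digit ai+bi
        result[hi_idx] += prod >> bits   # high half goes to digit ai+bi+1
    # deferred carry propagation: normalise every digit to 'bits' bits
    carry = 0
    for i in range(len(result)):
        total = result[i] + carry
        result[i] = total & mask
        carry = total >> bits
    return result


def digits_to_int(digits, bits=8):
    """Convert a little-endian digit list back to a python int."""
    return sum(d << (bits * i) for i, d in enumerate(digits))


a = [0xff, 0xff]   # 0xffff as two 8-bit digits
b = [0xff, 0xff]
print(hex(digits_to_int(bigmul(a, b))))
```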

i'm almost tempted to suggest that the offset (+1)
be a parameter, but... hmmm... is there space to do
that (in SVSHAPE0-3 i mean)?
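
a sketch of what "offset as a parameter" could look like in the
python model. the function name and the default of 1 are assumptions;
nothing here says how (or whether) the value would be encoded in
SVSHAPE0-3.

```python
def mul_remap_yielder_offset(a_sz, b_sz, offset=1):
    """Same loop nest as python_mul_remap_yielder, but the +1 on the
    4th output is a parameter (hypothetical variant)."""
    for ai in range(a_sz):
        for bi in range(b_sz):
            yield ai, bi, ai + bi, ai + bi + offset


# offset=1 reproduces the original yielder;
# offset=0 makes the 4th output equal to the 3rd.
for item in mul_remap_yielder_offset(2, 2, offset=1):
    print(item)
```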

that's going to be the next critical task, but i feel
that we may need a few more examples here so as not
to run out of bits in SVSHAPEn. hmm, not sure: i'm also
tempted to just go for it.

anyway, the next task is definitely to choose a format from comment #2:

> space in SHAPE SPRs
> 
> https://libre-soc.org/openpower/sv/remap/
> 
> 
> | 0:5    | 6:11   | 12:17  | 18:20   | 21:23   | 24:27  | 28:29   | 30:31 | Mode    |
> |--------|--------|--------|---------|---------|--------|---------|-------|---------|
> | xdimsz | ydimsz | zdimsz | permute | invxyz  | offset | skip    | mode  | Matrix  |
> | rsvd   | rsvd   | xdimsz | rsvd    | invxy 0 | offset | submode | 0b10  | Red/Sum |
> | rsvd   | ydimsz | xdimsz | rsvd    | invxy 1 | offset | submode | 0b10  | Bigmul  |
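
to make the bit ranges concrete, here is a rough sketch that pulls the
fields of the table out of a 32-bit value. it uses the Matrix row's
field names (the other rows reuse the same bit ranges) and ASSUMES
Power ISA style MSB0 numbering, i.e. bit 0 is the most significant
bit of the 32-bit word. the names SVSHAPE_FIELDS and decode_svshape
are made up for illustration and are not taken from the simulator.

```python
# (name, first bit, last bit), inclusive, MSB0 numbering (assumption)
SVSHAPE_FIELDS = [
    ("xdimsz", 0, 5),
    ("ydimsz", 6, 11),
    ("zdimsz", 12, 17),
    ("permute", 18, 20),
    ("invxyz", 21, 23),
    ("offset", 24, 27),
    ("skip", 28, 29),
    ("mode", 30, 31),
]


def decode_svshape(value):
    """Split a 32-bit integer into the fields of the table above."""
    fields = {}
    for name, first, last in SVSHAPE_FIELDS:
        width = last - first + 1
        shift = 31 - last            # MSB0: bit 31 is the LSB
        fields[name] = (value >> shift) & ((1 << width) - 1)
    return fields


# offset=1 (bits 24:27) and mode=0b10 (bits 30:31)
print(decode_svshape((1 << 4) | 0b10))
```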

still TODO:

> have to check z of invxyz is free

(and jacob you still need to implement inversion in Red/Sum, i'm
 not going to do it for you. it doesn't matter what *you* think it
 might or might not be used for: we present people with the option
 and they explore over the next 1-10+ years)

* the offset field (bits 24:27) will cover the "+1", for example
* rsvd bits 0:5 can be the "modulo", where 0 means "no modulo";
  this will give the repeating modulo pattern
* permute (bits 18:20) will say what the order of the for-loops is
  ("for ai for bi" vs "for bi for ai")
* submode specifies triangle or rhombus
* invxy says whether ai counts up (0 to xdimsz-1) or down
  (xdimsz-1 to 0), and likewise bi

it's all there i believe?
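
a rough python model of most of the options listed above, as a sketch
only. every parameter name is hypothetical. in particular, applying
the modulo to the 3rd and 4th outputs is a guess at what "repeating
modulo pattern" means, and submode (triangle vs rhombus) is left out
because its patterns are defined in comment #0, not here.

```python
def mul_remap_yielder_sketch(a_sz, b_sz, offset=1, modulo=0,
                             swap_loops=False, inv_a=False, inv_b=False):
    """Parameterised variant of python_mul_remap_yielder (hypothetical).

    offset     : added to the 4th output (the "+1")
    modulo     : 0 means "no modulo"; otherwise applied to the 3rd and
                 4th outputs (ASSUMPTION, not confirmed by the comment)
    swap_loops : False = "for ai for bi", True = "for bi for ai"
    inv_a/inv_b: count that index down instead of up
    """
    a_range = range(a_sz - 1, -1, -1) if inv_a else range(a_sz)
    b_range = range(b_sz - 1, -1, -1) if inv_b else range(b_sz)
    if swap_loops:
        pairs = ((ai, bi) for bi in b_range for ai in a_range)
    else:
        pairs = ((ai, bi) for ai in a_range for bi in b_range)
    for ai, bi in pairs:
        lo, hi = ai + bi, ai + bi + offset
        if modulo:
            lo, hi = lo % modulo, hi % modulo
        yield ai, bi, lo, hi


# with all defaults this matches the original yielder
for item in mul_remap_yielder_sketch(2, 2):
    print(item)
```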
