[Libre-soc-bugs] [Bug 230] Video opcode development and discussion

Sun Dec 20 20:09:45 GMT 2020

https://bugs.libre-soc.org/show_bug.cgi?id=230

--- Comment #64 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to cand from comment #62)
> That's the normal SV setup, while we were discussing horizontal add. For the
> horz-add case, VL would be big for RA and small for RB.

.... *click*.  you mean map-reduce.  the word "horizontal" confused me,
different term.  got it.  yes, sorry, i completely lost the context.
(small request: could you use the "reply" button? it will do the usual email
">" quoting. i would probably still have got it wrong, with the word
"horizontal", but hey... :) )

ok so let's go over this.  firstly: VL is the outer for-loop, it doesn't change
for RA, RB or RT.  did you mean "big elwidth for RA, small elwidth for RB"
instead?
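
to illustrate the "VL is the outer for-loop" point, here is a minimal sketch of
an element-wise add. it assumes the register file can be modelled as a flat
python list; the name sv_add and the register numbers are made up for
illustration and are not taken from the SV spec. the same VL indexes RT, RA
and RB alike:

```python
# minimal sketch: one VL drives the element loop for all three operands.
# "iregs" as a flat python list and "sv_add" are illustrative assumptions.

def sv_add(iregs, RT, RA, RB, VL):
    """element-wise add over VL elements; VL is shared by RT, RA and RB."""
    for i in range(VL):
        iregs[RT + i] = iregs[RA + i] + iregs[RB + i]
    return iregs

add_regs = [0] * 32
add_regs[8:11] = [1, 2, 3]       # RA vector starts at r8
add_regs[16:19] = [10, 20, 30]   # RB vector starts at r16
sv_add(add_regs, RT=24, RA=8, RB=16, VL=3)
add_result = add_regs[24:27]     # the three destination elements
```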

to cover vec_msum, i've added a "reduce" mode to SV.  this:

     mul RT, RA, RB reducemode,VL=3

would result in:

     RT = (RA[0]*RB[0]) * (RA[1]*RB[1]) * (RA[2]*RB[2])
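
taking that expansion literally, a rough python model might look like the
following. this is only a sketch: sv_mul_reduce, the flat iregs list and the
"combine" parameter are assumptions for illustration, not part of any spec.

```python
from functools import reduce
import operator

# sketch of the expansion above: multiply element pairs, then fold the
# per-element results into the single scalar RT.
# "sv_mul_reduce" and "combine" are hypothetical names for illustration.

def sv_mul_reduce(iregs, RT, RA, RB, VL, combine=operator.mul):
    """compute RA[i]*RB[i] for each i in VL, then fold the results into RT."""
    products = [iregs[RA + i] * iregs[RB + i] for i in range(VL)]
    iregs[RT] = reduce(combine, products)
    return iregs[RT]

red_regs = [0] * 32
red_regs[8:11] = [1, 2, 3]    # RA vector starts at r8
red_regs[16:19] = [4, 5, 6]   # RB vector starts at r16
red_product = sv_mul_reduce(red_regs, RT=24, RA=8, RB=16, VL=3)
# passing operator.add as the combining step gives a sum of products instead
red_sum = sv_mul_reduce(red_regs, RT=25, RA=8, RB=16, VL=3,
                        combine=operator.add)
```

with combine left at its default this follows the expansion as written; the
operator.add variant is shown only because a multiply-sum needs the products
added rather than multiplied together.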

but... you're looking for a *different* elwidth on RA from RB, is that right?
if so, that's.... yeah, very tough to fit into the SV paradigm, because
elwidths apply in "arithmetic" cases to the whole operation, and in "MV"
cases you have a src elwidth and a dest elwidth.

but, arithmetic operations with different elwidths? this requires three:

* src1 elwidth
* src2 elwidth
* dest elwidth

and that's too much.

it *might* however be ok to jam in that gather-add you suggested:

     gather-add: d = a.x + a.y + a.z + a.w

this will be hair-raising but i think it's doable, based on a bit in the Mode
to say "vec2/3/4 reduce is in effect".

would this work?

     for i in range(VL):
         iregs[RT+i] = 0
         for j in range(SUBVL):
             iregs[RT+i] += iregs[RA+i*SUBVL+j]
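
as a runnable version of that loop, with a small worked case (VL=2, SUBVL=4,
so two vec4 groups each collapse to one scalar). the function name gather_add,
the flat iregs list and the register numbers are assumptions for illustration.
note that, as written, the destination is zeroed before accumulating, so this
sketch assumes RT does not overlap the RA source range:

```python
# runnable sketch of the proposed vec2/3/4 reduce ("gather-add") loop.
# assumes iregs is a flat list and that RT does not overlap RA's range.

def gather_add(iregs, RT, RA, VL, SUBVL):
    """sum each group of SUBVL source elements into one destination element."""
    for i in range(VL):
        iregs[RT + i] = 0
        for j in range(SUBVL):
            iregs[RT + i] += iregs[RA + i * SUBVL + j]
    return iregs

ga_regs = [0] * 32
# two vec4 groups starting at r8: (x, y, z, w) for element 0, then element 1
ga_regs[8:16] = [1, 2, 3, 4, 10, 20, 30, 40]
gather_add(ga_regs, RT=24, RA=8, VL=2, SUBVL=4)
ga_result = ga_regs[24:26]   # one sum per vec4 group
```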
