[Libre-soc-dev] MP3 DCT36

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sat Jun 19 17:02:09 BST 2021

Lauri, i'm looking at this:

and other than the (newly) introduced "reverse gear" i'm really not
seeing anything else needed (oh, apart from 128 registers).

* the for-loop 17-1 and 17-2 can be done with reverse-gear,
  where the 17-2 can use predication 0b0101010101
* the for-loop 0..2 is a straight setvli=2. everything referencing
  in1 is always 2* i.e. the first loop never uses values of the 2nd
* the loop 0..4: again, hm, as long as tmp is put back onto the stack
  and then reloaded with offsets +4+4+4+4, it should be possible
  to hit all 4 in one single go, with setvli=4
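
to illustrate the first bullet, here is a rough python model. it is only a
sketch under assumptions: the referenced source is not reproduced in this
message, so the loop bodies (buf[i] += buf[i-1] and buf[i] += buf[i-2]), the
vector length of 18, the helper names, and the exact width and alignment of
the alternating mask are all guesses for illustration. it shows why an
in-place descending loop needs elements processed high-to-low ("reverse
gear"), and how a stride-2 loop can be expressed as an alternating predicate.

```python
VL = 18  # assumed vector length: elements 0..17


def predicated_inplace_add(buf, step, mask, reverse):
    """Model: buf[i] += buf[i - step] for each element whose mask bit is set.

    reverse=True walks elements from high to low (the "reverse gear" idea).
    Bit i of mask controls element i (assumed ordering).
    """
    order = range(VL - 1, -1, -1) if reverse else range(VL)
    for i in order:
        if i >= step and (mask >> i) & 1:
            buf[i] += buf[i - step]


def scalar_reference(data):
    """Assumed shape of the two scalar loops (17 down to 1, 17 down to 3 by 2)."""
    buf = list(data)
    for i in range(17, 0, -1):
        buf[i] += buf[i - 1]
    for i in range(17, 2, -2):
        buf[i] += buf[i - 2]
    return buf


def vector_model(data, reverse=True):
    """Same computation as two predicated vector operations."""
    buf = list(data)
    all_elements = (1 << VL) - 1
    # alternating mask: only odd elements 3, 5, ..., 17 are active
    odd_from_3 = sum(1 << i for i in range(3, VL, 2))
    predicated_inplace_add(buf, 1, all_elements, reverse)
    predicated_inplace_add(buf, 2, odd_from_3, reverse)
    return buf


samples = list(range(1, VL + 1))
# reverse order reproduces the scalar loops
assert vector_model(samples) == scalar_reference(samples)
# forward order reads already-updated elements and gives a different answer
assert vector_model(samples, reverse=False) != scalar_reference(samples)
```

the forward-order mismatch is the whole point: each element reads a
lower-numbered neighbour, so that neighbour must not have been updated yet.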

actually... you know what? i have a sneaking suspicion... that
code at the end (outside of the loop x4) looks identical to the bits
inside.  therefore, i suspect, it may be possible to use a *FIVE*
loop, but set a predicate mask for s0/s1 of 0b11111 and a
predicate mask for s1/s2 of 0b01111 and to cut the last part
out completely (lines 373-380).
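
a minimal python sketch of that idea, under assumptions: the two operations
(an add and a subtract), the array names and the helper functions are
hypothetical stand-ins, not the actual code being discussed. it only
demonstrates that a loop of 4 doing two operations plus a trailing fifth step
doing just the first one is equivalent to a loop of 5 where the second
operation is predicated with 0b01111.

```python
from operator import add, sub


def predicated_apply(op, vl, mask, dest, a, b):
    """Model of a predicated vector op: element i runs only if mask bit i is set."""
    for i in range(vl):
        if (mask >> i) & 1:
            dest[i] = op(a[i], b[i])


def loop_of_four_plus_tail(a, b):
    """Assumed original shape: 4 full iterations, then a partial fifth step."""
    first = [None] * 5
    second = [None] * 5
    for i in range(4):
        first[i] = a[i] + b[i]
        second[i] = a[i] - b[i]
    # tail outside the loop: only the first operation is performed
    first[4] = a[4] + b[4]
    return first, second


def loop_of_five_predicated(a, b):
    """Proposed shape: vector length 5, second operation masked on element 4."""
    first = [None] * 5
    second = [None] * 5
    predicated_apply(add, 5, 0b11111, first, a, b)   # all five elements
    predicated_apply(sub, 5, 0b01111, second, a, b)  # elements 0..3 only
    return first, second


a = [10, 20, 30, 40, 50]
b = [1, 2, 3, 4, 5]
assert loop_of_five_predicated(a, b) == loop_of_four_plus_tail(a, b)
```

element 4 of the second result stays untouched in both versions, which is
what lets the separate tail code disappear.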

what do you think?

