[Libre-soc-dev] [RFC] Matrix and DCT/FFT SVP64 REMAP
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Tue Jul 6 09:53:45 BST 2021
On Tue, Jul 6, 2021 at 3:42 AM Richard Wilbur <richard.wilbur at gmail.com> wrote:
> I suppose this is where having the semantics to code a dedicated “0”
> source in the register specification could be useful to allow the dispatcher
> to tell the vector unit to send 0’s for a particular operand.
> (Avoiding the explicit initialization of C.)
the general rule for SVP64 (which has just been obliterated, out
of necessity, by a special DCT/FFT butterfly instruction) is:
no new opcodes.
or more strictly:
*definitely* no new opcodes that involve "re-interpretation"
of base (scalar) 32-bit v3.0B ones.
that rule was broken for the very first time with a bit-reverse
LD (to add an RC shift field, partly-embedded into the LD
immediate) and i can tell you for free it will be a royal nuisance.
the mess it's made of PowerDecoder2 is... gaah.
however in this particular case (C=0) modifying the 4-operand
operations, or adding a new one, is so expensive in opcode
space that the "justification" cost is almost 100% certain
to be too high.
fnmadd etc. sit within an A-Form, maddhd etc. sit within
a VA-Form. these are extremely expensive in terms of opcode
space, to the point where maddhd etc. don't even have Rc=1.
having an explicit zeroing series of instructions to initialise
the matrix to zero *when it is needed* is perfectly acceptable.
i mean, we're talking 5 instructions, now, for an arbitrary-sized
runtime-selectable matrix multiply (up to 64 FMACs).
in scalar form - for fixed-size 4x4 matrices - it comes out at
80 instructions. enable -O3 and it jumps to a whopping 340
(full explicit unrolling of the 3 nested loops occurs).
given that we're looking at greater than an 8x reduction in
code size against the non-optimised version and a stunning 60x
reduction against the compiler-optimised version i don't believe
it's worthwhile pursuing further optimisations for zeroing.
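
for illustration, a minimal sketch in C of what the scalar 4x4 case
looks like: an explicit zeroing pass over C, followed by the 3-nested
multiply-accumulate loop (4x4x4 = 64 FMACs). the function names
(matrix_zero, matrix_fmac), the float element type and the flat
row-major layout are assumptions made for this sketch. it is not the
code that was actually measured, so it will not necessarily reproduce
the instruction counts quoted above.

```c
#include <stddef.h>

#define N 4  /* fixed 4x4 size, as in the scalar case discussed above */

/* explicit initialisation of the result matrix, done only when needed
 * (hypothetical helper name, for illustration) */
void matrix_zero(float *c)
{
    for (size_t i = 0; i < N * N; i++)
        c[i] = 0.0f;
}

/* C += A * B on row-major N x N matrices.
 * the innermost statement is the FMAC: N*N*N = 64 of them for N = 4.
 * note that C is accumulated into, never cleared, which is why the
 * separate zeroing step exists. */
void matrix_fmac(float *c, const float *a, const float *b)
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            for (size_t k = 0; k < N; k++)
                c[i * N + j] += a[i * N + k] * b[k * N + j];
}
```

calling matrix_zero(c) and then matrix_fmac(c, a, b) gives a plain
matrix multiply. skipping the zeroing accumulates on top of whatever
C already holds, which is exactly the case where no initialisation
is wanted.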