[Libre-soc-dev] MP3 DCT36

Luke Kenneth Casson Leighton lkcl at lkcl.net
Thu Jun 17 14:51:53 BST 2021

On 6/17/21, Lauri Kasanen <cand at gmx.com> wrote:
> On Thu, 17 Jun 2021 11:49:42 +0100
> Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:
>> thus it is a bit of a pain in the ass to calculate, and unless there is
>> a really compelling reason i'd like to leave it for the "advanced"
>> version if that's ok.
> Yeah not needed for now.

on reflection i am actually really tempted to implement it straight away.

the reason is that it means both DCT and MMult can be done *in-place*.

right now, for DCT, you have to do a LD/ST offset by 4, right? (actually 4x4)

but, in some places, you might actually have to *reorder* the
registers, doing so manually, am i right?

like this:

fmv f0, f24
fmv f4, f25
fmv f8, f26
fmv f12, f27

which is not only terribly wasteful of instructions, it also can't be
optimised away: you *have* to use more registers.

REMAP will allow those +4+4+4+4 jumps on *ONE* register (src1) whilst
on another (dest) it is doing +1+1+1+1

therefore the data can stay IN-PLACE whilst still doing long vector operations.
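
to make the +4 versus +1 stepping concrete, here is a toy model in python.
it is only a sketch, not the real REMAP pseudocode: the list `regs` stands in
for the FP register file, and `remapped_fadd` plus all of its parameters are
names made up for illustration.

```python
# toy model only: a python list stands in for the FP register file.
# remapped_fadd and its parameters are hypothetical, not SVP64 definitions.

def remapped_fadd(regs, dest, src1, src2, vl, src1_step=4):
    """vector add of length vl where src1 jumps by src1_step per element
    while src2 and dest advance by 1 per element."""
    out = list(regs)  # work on a copy so the caller's list is untouched
    for i in range(vl):
        a = regs[src1 + i * src1_step]  # +4 +4 +4 +4 on src1
        b = regs[src2 + i]              # +1 +1 +1 +1 on src2
        out[dest + i] = a + b           # +1 +1 +1 +1 on dest
    return out

regs = [float(i) for i in range(32)]
result = remapped_fadd(regs, dest=16, src1=0, src2=20, vl=4)
print(result[16:20])
```

the point is that src1 is read from registers 0, 4, 8, 12 directly, so no
separate fmv instructions are needed to line the data up first.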

and the beauty of it is, you can even do *three* dimensions of REMAP,
at any arbitrary size on each.  not restricted to power-of-two.

> I don't even understand it after reading the
> page several times.

instead of the linear for-loop going 0 1 2 3 4 ...

you can do 0 4 8 1 5 9 2 6 10 3 7 11

or 0 3 6 9 1 4 7 10 2 5 8 11

or absolutely anything.
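
the two sequences above are just nested for-loops walked in a different
order. a rough python sketch of that idea follows. `remap_indices`, `dims`
and `order` are hypothetical names for illustration, and this is not the
actual REMAP pseudocode, only a model of the index pattern.

```python
import itertools

def remap_indices(dims, order):
    """return the flat element indices visited by nested loops.

    dims:  size of each dimension, dims[0] being the one that is
           contiguous (step of 1) in the flat layout.
    order: dimension numbers, outermost loop first.
    """
    # stride of each dimension in the flat layout
    strides = []
    step = 1
    for size in dims:
        strides.append(step)
        step *= size
    # itertools.product varies its last argument fastest, so the first
    # entry of order ends up as the outermost loop
    loops = [range(dims[d]) for d in order]
    return [sum(c * strides[d] for c, d in zip(coords, order))
            for coords in itertools.product(*loops)]

print(remap_indices((4, 3), (1, 0)))  # plain linear order
print(remap_indices((4, 3), (0, 1)))  # 0 4 8 1 5 9 2 6 10 3 7 11
print(remap_indices((3, 4), (0, 1)))  # 0 3 6 9 1 4 7 10 2 5 8 11
```

the same function takes three dimensions with sizes that are not powers of
two, for example `remap_indices((2, 3, 5), (0, 2, 1))`, which visits each of
the 30 elements exactly once.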

> BTW, what was that about a setvl bug changing 8 to 7?

ah that was me screwing up the definition and pseudocode in setvl.

i'll sort that out.

