[Libre-soc-dev] MP3 DCT36
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Thu Jun 17 14:51:53 BST 2021
On 6/17/21, Lauri Kasanen <cand at gmx.com> wrote:
> On Thu, 17 Jun 2021 11:49:42 +0100
> Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:
>
>> thus it is a bit of a pain in the ass to calculate, and unless there is
>> a really compelling reason i'd like to leave it for the "advanced"
>> version if that's ok.
>
> Yeah not needed for now.
on reflection i am actually really tempted to implement it straight away.
the reason is that it means both DCT and MMult can be done *in-place*.
right now, for DCT, you have to do a LD/ST offset by 4, right? (actually 4x4)
but, in some places, you might actually have to *reorder* the
registers manually, am i right?
like this:
fmv f0, f24
fmv f4, f25
fmv f8, f26
fmv f12, f27
which is not only terribly wasteful of instructions: you can't
optimise it away, and you *have* to use more registers.
REMAP will allow those +4+4+4+4 jumps on *ONE* register (src1) whilst
on another (dest) it is doing +1+1+1+1
therefore the data can stay IN-PLACE whilst still doing long vector operations.
and the beauty of it is, you can even do *three* dimensions of REMAP,
at any arbitrary size on each. not restricted to power-of-two.
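
a rough sketch of the idea in python, purely as a model: the register file
is a plain list, and strided_fadd with its src1_step / dest_step arguments is
a made-up helper for illustration, not the actual REMAP encoding or any real
instruction. it only shows one operand jumping +4 while the destination
walks +1, with no fmv shuffling beforehand.

```python
# Hypothetical model only: the register file is a plain list and the
# step sizes are ordinary arguments. This is not the SVP64 REMAP encoding.
def strided_fadd(regs, dest, src1, src2, vl, src1_step=4, dest_step=1):
    """Element-wise add where src1 jumps by src1_step and dest by dest_step."""
    for i in range(vl):
        a = regs[src1 + i * src1_step]      # +4 +4 +4 +4 on src1
        b = regs[src2 + i]                  # src2 walks linearly
        regs[dest + i * dest_step] = a + b  # +1 +1 +1 +1 on dest
    return regs

regs = [float(n) for n in range(32)]  # stand-in for f0..f31, holding 0.0..31.0
strided_fadd(regs, dest=0, src1=8, src2=24, vl=4)
print(regs[0:4])  # reads f8, f12, f16, f20 and writes f0..f3
```

the source values stay where they are (f8, f12, f16, f20 are untouched), which
is the in-place property being described.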
> I don't even understand it after reading the
> page several times.
with REMAP, instead of the linear for-loop order 0 1 2 3 4 ...
you can do 0 4 8 1 5 9 2 6 10 3 7 11
or 0 3 6 9 1 4 7 10 2 5 8 11
or absolutely anything.
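
the two sequences above can be reproduced with a small python sketch.
remap_order is a hypothetical name used only here, and the assumption is
that the data is a row-major rows x cols block being visited column by
column; the real REMAP hardware schedule may be parameterised differently.

```python
def remap_order(rows, cols):
    """Indices of a row-major rows x cols block, visited column by column."""
    return [r * cols + c for c in range(cols) for r in range(rows)]

print(remap_order(3, 4))  # [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
print(remap_order(4, 3))  # [0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11]
```

note that neither 3x4 nor 4x3 needs a power-of-two dimension.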
> BTW, what was that about a setvl bug changing 8 to 7?
ah that was me screwing up the definition and pseudocode in setvl.
i'll sort that out.
l.