[Libre-soc-dev] FFT, DCT, REMAP
programmerjake at gmail.com
Mon Jun 21 00:52:11 BST 2021
On Sun, Jun 20, 2021 at 2:41 PM Luke Kenneth Casson Leighton
<lkcl at lkcl.net> wrote:
> it occurred to me from looking at imdct36 in ffmpeg that the REMAP
> facility may be able to autogenerate the array indices for FFT, both
> inner and outer loops.
> this code shows how simple the loops are:
> in SVSTATE, srcstep can be used for the inner loop, dststep for the
> outer while loop, and perhaps the offset field could be put to use as
> a 3rd loop.
> or: in theory with REMAP having 3 dimensions it should be possible to
> use src/dststep for all three, the i j and the while loop, for a total
> of up to 64 elements.
> for longer than that, 2 dimensions could be used with a manual
> external while loop.
> the only fly in the ointment: there are *TWO* mul-accumulates, of
> opposite sign. SV was never designed for this scenario.
I think the easiest solution is to do the FFT out of place: perform the
fmadds and fmsubs while copying from one array to another, instead of
trying to do an in-place mul-add/sub.
For a good algorithm, see Listing 3 in
https://www.spiral.net/doc/papers/ffte-spiral.pdf (FFTE on SVE:
SPIRAL-Generated Kernels -- a paper describing how to implement FFT
for Arm SVE).
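
To make the out-of-place idea concrete, here is a rough Python sketch. It
is not the listing from the paper, just a plain radix-2 decimation-in-time
FFT that I wrote for illustration. The function names (bit_reverse,
fft_out_of_place) and the ping-pong buffers src/dst are my own assumptions,
and complex arithmetic stands in for the real fmadd/fmsub pairs. The while,
i and j loops are the three loops discussed above. Each butterfly reads
from src and writes both the "+" and the "-" result to dst, so neither
mul-add overwrites an input the other one still needs.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i (hypothetical helper)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_out_of_place(x):
    """Radix-2 DIT FFT that ping-pongs between two arrays."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    # Initial copy in bit-reversed order, so later stages use plain indices.
    src = [complex(x[bit_reverse(i, bits)]) for i in range(n)]
    dst = [0j] * n
    size = 2
    while size <= n:                      # outer "while" loop: one FFT stage
        half = size // 2
        for i in range(0, n, size):       # i loop: one block per iteration
            for j in range(half):         # j loop: one butterfly per iteration
                w = cmath.exp(-2j * cmath.pi * j / size)  # twiddle factor
                a = src[i + j]
                b = src[i + j + half]
                dst[i + j] = a + w * b         # the mul-add (fmadd-like)
                dst[i + j + half] = a - w * b  # the opposite sign (fmsub-like)
        src, dst = dst, src               # swap buffers instead of updating in place
        size *= 2
    return src
```

Calling fft_out_of_place([1, 2, 3, 4, 5, 6, 7, 8]) should give the same
result as a direct DFT of those eight samples, up to rounding. The cost of
this approach is a second array of the same size as the input.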
> a workaround is to use the OE flag on madd, fmadd and fmadds to
> indicate that those operations should be modified suitably.
I don't think that's the best idea: it seems very semantically messy.