[Libre-soc-dev] FFT, DCT, REMAP

Jacob Lifshay programmerjake at gmail.com
Mon Jun 21 00:52:11 BST 2021


On Sun, Jun 20, 2021 at 2:41 PM Luke Kenneth Casson Leighton
<lkcl at lkcl.net> wrote:
>
> it occurred to me from looking at imdct36 in ffmpeg that the REMAP
> facility may be able to autogenerate the array indices for FFT, both
> inner and outer loops.
>
> this code shows how simple the loops are:
> https://www.nayuki.io/res/free-small-fft-in-multiple-languages/fft.py
>
> in SVSTATE, srcstep can be used for the inner loop, dststep for the
> outer while loop, and perhaps the offset field could be put to use as
> a 3rd loop.
>
> or: in theory with REMAP having 3 dimensions it should be possible to
> use src/dststep for all three, the i j and the while loop, for a total
> of up to 64 elements.
>
> for longer than that, 2 dimensions could be used with a manual
> external while loop.
>
> the only fly in the ointment: there are *TWO* mul-accumulates, of
> opposite sign.  SV was never designed for this scenario.

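I think the easiest solution is to make the FFT out-of-place: do the
fmadds and fmsubs while copying from one array to another, instead of
trying to do an in-place mul-add/sub. Each butterfly then reads its
two inputs from the source array and writes the sum and the difference
to separate elements of the destination array.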
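
As a rough illustration of the out-of-place idea, here is a minimal Python
sketch. It assumes a radix-2, power-of-two-length FFT with Stockham-style
indexing that ping-pongs between two buffers; that is my choice for the
example and not necessarily what Listing 3 of the paper below does. The
name fft_out_of_place is hypothetical, and Python complex arithmetic
stands in for the separate real/imaginary fmadds and fmsubs.

```python
import cmath

def fft_out_of_place(x):
    """Radix-2 forward FFT that copies between two buffers at every stage.

    Sketch only: Stockham-style indexing is an assumption made for this
    example, not something specified in the thread.
    """
    n_total = len(x)
    if n_total == 0 or n_total & (n_total - 1):
        raise ValueError("length must be a power of two")
    src = [complex(v) for v in x]   # working copy, the input is not modified
    dst = [0j] * n_total            # second buffer, written by each stage
    n = 2                           # current sub-transform size
    while n <= n_total:             # outer "while" loop over stages
        half = n // 2
        stride = n_total // n
        for p in range(half):
            w = cmath.exp(-2j * cmath.pi * p / n)   # twiddle factor
            for q in range(stride):
                a = src[q + stride * (2 * p)]
                b = src[q + stride * (2 * p + 1)]
                # The two mul-accumulates of opposite sign go to
                # different destination elements, so neither overwrites
                # an input the other still needs.
                dst[q + stride * p] = a + w * b           # mul-add
                dst[q + stride * (p + half)] = a - w * b  # mul-sub
        src, dst = dst, src         # this stage's output feeds the next
        n *= 2
    return src
```

Because the add and the subtract write to separate destination elements,
each can be issued as an ordinary independent operation, and with this
indexing no separate bit-reversal pass is needed.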

For a good algorithm, see Listing 3 in "FFTE on SVE: SPIRAL-Generated
Kernels", a paper describing how to implement FFT for Arm SVE:
https://www.spiral.net/doc/papers/ffte-spiral.pdf

> a workaround is to use the OE flag on madd, fmadd and fmadds to
> indicate that those operations should be modified suitably.

I don't think that's the best idea: it seems very semantically messy.

Jacob


