[Libre-soc-dev] [RFC] REMAP, SHAPE, complex (full) FFT
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Jul 19 12:29:15 BST 2021
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
On Mon, Jul 19, 2021 at 10:20 AM Hendrik Boom <hendrik at topoi.pooq.com> wrote:
> I'm not sure how helpful I can be, but I'd like to have a look at the
that'd be fantastic.
> Could you show me the versions of the recursive and buttefly code you
> are trying to iterativize?
and associated test program:
these now work. very frustrating to have worked out, but they're working.
iterative DCT, but *not* in-place.... yet.
now... this diagram could be wrong, but notice the inputs are in
order 0 1 2 3.... 7 but the output is in bit-reversed order 0 4 2 6 1 5 3 7?
now notice how the outer butterly does **NOT** have to have bit-reversing
done after it?
that's because i *PRE-LOADED* the data in bit-reversed order...
*and then created an index lookup table to invert that, in-place*
during the inner butterfly.
however... sigh... the original code does this:
0 MUL 7 --> +/- 0 4
1 MUL 6 -> +/- 1 5
2 MUL 5 -> +/- 2 6
3 MUL 4 -> +/- 3 7
which of course is not in-place, at all. the first half is in-place, but
the second half is reverse-number order:
* 7 goes in position 4
* 6 in position 5
* 5 in position 6
* 4 in position 7
the next trick to investigate, therefore, is to *pre-load* the data with
a *pre-reversed* top half, such that the MULs can be done as:
0 MUL 7 --> +/- 0 7
1 MUL 6 -> +/- 1 6
2 MUL 5 -> +/- 2 5
3 MUL 4 -> +/- 3 4
> And any iterative versions you've been able to derive from them?
> If I remember correctly, the essence of FFT is recursive decomposition,
> and making it iterative is likely tricky.
it's a standard computer science trick from Functional Programming
that any recursive algorithm may be proven to be the same as its
FFT has already been done as an iterative variant: solved many years
ago and even in-place, the trick being there to *bit-reverse* the indices
of the data being loaded *while* loading it into memory. algorithm can
be found here:
where the original is here:
More information about the Libre-soc-dev