[Libre-soc-dev] FFT, DCT, REMAP
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Jun 21 03:13:39 BST 2021
On 6/21/21, Jacob Lifshay <programmerjake at gmail.com> wrote:
> I think the easiest solution is to do the FFT by doing the fmadds and
> fmsubs while copying from one array to another instead of trying to do
> an in-place mul-add/sub.
this is already achievable so does not need discussion except for
comparison because no changes or enhancements are required to
implement that (mot as efficient) algorithm.
due to the long length of FFTs it cuts the maximum length in half.
also it means only the very inner loop can be covered. fnmadds in one
batch (one loop), fmsubs in the next.
this can be anyway without further enhancements.
this proposal is to consider enhanceements which result in significant
resource savings such as being able to do double the length without
costly register spill.
and to quite literally use one instruction where dozens would
otherwise be needed.
if this was not such a highly strategic algorithm used in video
processing audio processing convolutions radar radio wifi LTE pretty
much every commercial area which is high value i would not have raised
to give some idea, TI almost certainly designed their VLIW DSP with 14
instructions and added Zero-Overhead Looping precisely to be able to
do FFT inner loop with 100% IPC, no stalls.
therefore the idea *is* to consider *how* it xan be done rather than
take the easy/lazy way out.
then search around and see if there are other algorithms for which a
butterfly pattern is useful.
cryptographic applications particularly using Galois Field variant of
mul-add springs to mind.
More information about the Libre-soc-dev