[Libre-soc-dev] [RFC] SVP64 Vertical-First Mode loops

Wed Aug 18 23:02:49 BST 2021

On Aug 18, 2021, at 13:06, lkcl <luke.leighton at gmail.com> wrote:
> basically, to do large DCT / FFT recursively, you split into two halves, do each half at half the DCT/FFT size, then recombine the results.

Each half could use the same scalar coefficients.  For a given data set size, we compute the transform recursively from smaller transforms.  If the sizes are always related by powers of two, then calculating the coefficients once should be sufficient.  If we could calculate them and store them either in the order they are used (in a non-destructive FIFO with the capability to set a step size) or with an easy scheme to access them via an index, we could calculate the coefficients once using our vector engine and then reuse them on each of the subdivisions of the transform below a certain size, avoiding hammering cache and external memory for the coefficients in the process!
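
To make the "index with a step size" idea concrete, here is a rough sketch in plain Python (not SVP64 code, and not a proposal for the actual hardware scheme).  It assumes a radix-2 decimation-in-time FFT; the names make_twiddles and fft and the stride parameter are made up for illustration.  The coefficient table is computed once for the full size n, and a sub-transform of size m reads entry k * (n / m) from that same table:

```python
import cmath


def make_twiddles(n):
    """Compute the n/2 coefficients exp(-2*pi*i*k/n) once, for the full size n."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]


def fft(x, twiddles, stride=1):
    """Recursive radix-2 decimation-in-time FFT (len(x) must be a power of two).

    Assumption for illustration: every recursion level indexes the one shared
    table with a step size.  A sub-transform of size m uses stride = n // m,
    because exp(-2*pi*i*k/m) == exp(-2*pi*i*(k*stride)/n).
    """
    m = len(x)
    if m == 1:
        return list(x)
    # Each half is transformed at half the size, with double the step size.
    even = fft(x[0::2], twiddles, stride * 2)
    odd = fft(x[1::2], twiddles, stride * 2)
    out = [0j] * m
    for k in range(m // 2):
        t = twiddles[k * stride] * odd[k]  # strided read, no recomputation
        out[k] = even[k] + t
        out[k + m // 2] = even[k] - t
    return out


if __name__ == "__main__":
    data = [complex(v) for v in [1, 2, 3, 4, 0, -1, -2, -3]]
    table = make_twiddles(len(data))  # computed exactly once
    print(fft(data, table))
```

Both halves at a given level read exactly the same table entries, which is the reuse I am describing above.  Only the step size changes between levels.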

[…]

> which in turn makes the VFHint field also kinda unnecessary.

If we had such a coefficient cache, I think VFHint could still be useful.