[Libre-soc-dev] [RFC] SVP64 Vertical-First Mode loops
luke.leighton at gmail.com
Wed Aug 18 23:16:20 BST 2021
On August 18, 2021 10:02:49 PM UTC, Richard Wilbur <richard.wilbur at gmail.com> wrote:
>On Aug 18, 2021, at 13:06, lkcl <luke.leighton at gmail.com> wrote:
>> basically, to do large DCT / FFT recursively, you split into two
>halves, do each half at half the DCT/FFT size, then recombine the
>Each half could use the same scalar coefficients.
could... but remember: for an FFT of size N you need N coefficients. now you can only hold in the regfile half the FFT that you could if you did it with Vertical-First Mode
for DCT it is *N ln N* coefficients needed for a DCT of size N. DCT of size 32 needs 32+16+8+4+2+1 registers for the COS coefficients!
we just used the ENTIRE regfile!
you can use only 1/2 the regfile and do a 64-wide DCT
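As a quick check of the register tally above, here is a minimal Python sketch. The function name `dct_coefficient_rows` is hypothetical, and the row sizes (N, N/2, ..., 1) are taken directly from the 32+16+8+4+2+1 count in this message rather than from any particular DCT implementation.

```python
def dct_coefficient_rows(n):
    """Return the per-row COS coefficient counts n, n/2, ..., 1.

    Assumption: n is a power of two and each butterfly row needs its own
    set of coefficients, as tallied in the message above.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    rows = []
    while n >= 1:
        rows.append(n)
        n //= 2
    return rows


rows = dct_coefficient_rows(32)
total_registers = sum(rows)
print(rows, total_registers)
```

For a DCT of size 32 this lists the six row sizes and sums them to 63 registers for coefficients alone, before any data is loaded.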
> Seems for a
>particular size data set that if we are doing recursive sizes of
>transforms to compute the transforms. If they are always related by
>powers of two then one time calculating the coefficients should be
>sufficient if we could calculate them and store them either in the
>order they are used (in a non-destructive FIFO with capability to set a
>step size) or with an easy scheme to access them via an index, we might
>at once calculate the coefficients using our vector engine and then use
DCT unfortunately doesn't work that way. in order to complete all butterflies you need, in each row, cos((i+0.5)/n) from i=0..n-1 where n goes up in powers of two per butterfly row.
you can share those values *in* a row but unlike an FFT you cannot *reuse* them on a *different* row due to the +0.5
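The difference in reuse can be checked numerically. The sketch below is illustrative only: the helper names are hypothetical, the FFT twiddles are assumed to be the usual exp(-2*pi*j*k/n), and a factor of pi is assumed inside the cosine (the message writes the argument in shorthand as (i+0.5)/n).

```python
import cmath
import math


def fft_twiddles(n):
    """Twiddle factors exp(-2*pi*j*k/n) for k = 0 .. n/2-1."""
    return [cmath.exp(-2j * math.pi * k / n) for k in range(n // 2)]


def dct_cosines(n):
    """cos((i + 0.5) * pi / n) for i = 0 .. n-1 (pi factor is an assumption)."""
    return [math.cos((i + 0.5) * math.pi / n) for i in range(n)]


def is_subset(small, big, tol=1e-12):
    """True if every value in small also appears in big (within tol)."""
    return all(any(abs(s - b) < tol for b in big) for s in small)


def shares_any(small, big, tol=1e-12):
    """True if at least one value in small also appears in big (within tol)."""
    return any(abs(s - b) < tol for s in small for b in big)


# FFT: the size-8 twiddles are the even-indexed size-16 twiddles.
fft_reusable = is_subset(fft_twiddles(8), fft_twiddles(16))

# DCT: the +0.5 offset means no size-8 value coincides with a size-16 value.
dct_reusable = shares_any(dct_cosines(8), dct_cosines(16))

print(fft_reusable, dct_reusable)
```

Under these assumptions the smaller FFT row can be read from the larger row's coefficients with a stride of two, whereas the smaller DCT row shares no value with the larger one, so each row needs its own registers.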
>If we had such a coefficient cache, I think VFHint could still be
interesting idea, to have a special separate cache for coefficients. it is however pretty specialist. if it really becomes a focus for performance it's worth pursuing.
right now issuing cos instructions is "generic". specialist single-purpose instructions make me twitchy.
for 3D texture interpolation it's fine / great / obvious payoff.