[Libre-soc-dev] [RFC] SVP64 Vertical-First Mode loops

Richard Wilbur richard.wilbur at gmail.com
Thu Aug 19 03:00:42 BST 2021

> On Aug 18, 2021, at 18:20, lkcl <luke.leighton at gmail.com> wrote:
> * cost in registers and memory for HF variant:
>   - N ln N registers for cos coefficients
>   - N registers for input
>   - N ln N LDs of coefficients from memory
>   - N LDs for input
>   - N STs for output
> total:
> - N + (N ln N) regs
> - 3N + (N ln N) memory accesses

By my count that adds up to
  2N + (N ln N) memory accesses
(N LDs for input + N STs for output + N ln N LDs of coefficients).

> * cost in regs and mem for VF:
>   - ONE scalar reg for cos coeff
>   - N regs for input
>   - ZERO LDs for coeffs
>   - N LDs for input
>   - N STs for output
> total:
> - N + 1 regs
> - 2N memory accesses

Hence the excess of HF over VF is around (N ln N) for both registers (N ln N - 1, to be exact) and memory accesses (N ln N).  That is still a non-trivial overhead.
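
For anyone who wants to check the arithmetic, here is a rough Python sketch of the two tallies.  The function names are mine and purely illustrative, and "ln" is taken literally as the natural log, as written in the quoted list; this only reproduces the counting above, not any actual SVP64 behaviour.

```python
import math


def hf_cost(n):
    """Horizontal-First tally, following the quoted list item by item."""
    coeffs = n * math.log(n)   # N ln N cos coefficients (assumes natural log)
    regs = coeffs + n          # coefficient regs + N input regs
    mem = coeffs + n + n       # coefficient LDs + N input LDs + N output STs
    return regs, mem


def vf_cost(n):
    """Vertical-First tally: one scalar coefficient reg, no coefficient LDs."""
    regs = n + 1               # N input regs + ONE scalar reg
    mem = n + n                # N input LDs + N output STs
    return regs, mem


def excess(n):
    """How much more HF costs than VF, as (registers, memory accesses)."""
    hf_regs, hf_mem = hf_cost(n)
    vf_regs, vf_mem = vf_cost(n)
    return hf_regs - vf_regs, hf_mem - vf_mem


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        extra_regs, extra_mem = excess(n)
        print(f"N={n}: extra regs ~ {extra_regs:.1f}, extra mem ~ {extra_mem:.1f}")
```

Both differences come out as N ln N (less one for the registers), which is where the "around (N ln N)" figure comes from.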
