[Libre-soc-dev] [RFC] SVP64 Vertical-First Mode loops
Richard Wilbur
richard.wilbur at gmail.com
Thu Aug 19 03:00:42 BST 2021
> On Aug 18, 2021, at 18:20, lkcl <luke.leighton at gmail.com> wrote:
[…]
> * cost in registers and memory for HF variant:
>
> - N ln N registers for cos coefficients
> - N registers for input
> - N ln N LDs of coefficients from memory
> - N LDs for input
> - N STs for output
>
> total:
>
> - N + (N ln N) regs
> - 3N + (N ln N) memory accesses
By my count that adds up to
(N ln N) + N + N = 2N + (N ln N) memory accesses
> * cost in regs and mem for VF:
>
> - ONE scalar reg for cos coeff
> - N regs for input
> - ZERO LDs for coeffs
> - N LDs for input
> - N STs for output
>
> total:
>
> - N + 1 regs
> - 2N memory accesses
Hence the excess of HF over VF is around (N ln N) for both registers and memory accesses: (N ln N) - 1 registers and (N ln N) memory accesses. Still a non-trivial overhead.
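
For anyone who wants to check the arithmetic, here is a rough Python sketch of the two tallies. The function names are mine and purely illustrative, nothing here comes from the SVP64 spec. It takes the quoted N ln N coefficient count at face value (natural log, so the result is not an integer) and uses the quoted register totals together with the 2N + (N ln N) memory count.

```python
import math


def hf_cost(n):
    """HF tally from the quoted list: returns (registers, memory accesses)."""
    coeffs = n * math.log(n)   # N ln N cos coefficients (natural log, as quoted)
    regs = n + coeffs          # N input regs + N ln N coefficient regs
    mem = coeffs + n + n       # coefficient LDs + input LDs + output STs
    return regs, mem


def vf_cost(n):
    """VF tally from the quoted list: returns (registers, memory accesses)."""
    regs = n + 1               # N input regs + ONE scalar reg for the coefficient
    mem = 0 + n + n            # ZERO coefficient LDs + input LDs + output STs
    return regs, mem


def excess(n):
    """How much more HF costs than VF: (extra registers, extra memory accesses)."""
    hf_regs, hf_mem = hf_cost(n)
    vf_regs, vf_mem = vf_cost(n)
    return hf_regs - vf_regs, hf_mem - vf_mem


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        extra_regs, extra_mem = excess(n)
        print(f"N={n}: extra regs ~{extra_regs:.1f}, extra mem accesses ~{extra_mem:.1f}")
```

For N = 8 this gives an excess of roughly 15.6 registers and 16.6 memory accesses, i.e. (N ln N) - 1 and (N ln N) respectively.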