[Libre-soc-dev] [RFC] SVP64 Vertical-First Mode loops
luke.leighton at gmail.com
Thu Aug 19 14:22:35 BST 2021
On August 19, 2021 2:00:42 AM UTC, Richard Wilbur <richard.wilbur at gmail.com> wrote:
>by my count that adds to
> 2N + (N ln N) memory accesses
>Hence the excess is around (N ln N) for both registers and memory
>accesses. Still non-trivial overhead.
when strip-mining occurs it is as if you had no L1 cache at all: LD/ST runs 3-5 times slower. go even bigger and you strip-mine L2 as well, which ends up 8-10x slower.
i had an idea here which might help: an L1 cache hint which swaps over to using the next 6 LSBs for cache line lookup:
cache_row_num = MUX(Hint,
however you'd need to do a full cache flush to swap modes, so you'd better be damn sure you want to do that.
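
to make the idea concrete, here is a minimal sketch in plain python. it is an illustration under stated assumptions, not the actual expression from the post or any Libre-SOC code: the 64-byte line size, the 64-row cache, the reading of "next 6 LSBs" as the 6 address bits directly above the usual index field, and the names LINE_OFFSET_BITS, ROW_BITS and cache_row_num() are all hypothetical.

```python
# Hypothetical sketch: every constant and name here is an assumption
# made for illustration, not taken from the original post.
LINE_OFFSET_BITS = 6   # assumed 64-byte cache lines
ROW_BITS = 6           # assumed 64 rows, to match the "6 LSBs" wording


def cache_row_num(addr: int, hint: bool) -> int:
    """Select the L1 row index from one of two adjacent 6-bit address fields."""
    mask = (1 << ROW_BITS) - 1
    # Usual index: the 6 bits just above the byte-in-line offset.
    normal = (addr >> LINE_OFFSET_BITS) & mask
    # Hinted index: the next 6 bits above the usual index field.
    shifted = (addr >> (LINE_OFFSET_BITS + ROW_BITS)) & mask
    # Software equivalent of a 2-way mux controlled by the hint.
    return shifted if hint else normal


# A 4096-byte stride (64 rows * 64 bytes) lands on one row with the usual
# index, but spreads across all rows when the hint is set.
STRIDE = 1 << (LINE_OFFSET_BITS + ROW_BITS)
rows_normal = {cache_row_num(i * STRIDE, False) for i in range(64)}
rows_hinted = {cache_row_num(i * STRIDE, True) for i in range(64)}
```

under these assumptions the same address selects a different row in each mode, so lines cached under one mode would not be found under the other, which is why swapping modes needs the full flush.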