[Libre-soc-bugs] [Bug 230] Video opcode development and discussion
bugzilla-daemon at libre-soc.org
Fri Dec 11 23:16:42 GMT 2020
https://bugs.libre-soc.org/show_bug.cgi?id=230
--- Comment #19 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #18)
> (In reply to Jacob Lifshay from comment #17)
> > (In reply to Luke Kenneth Casson Leighton from comment #16)
> > > SIMD ISAs have the "advantage" of being able to hard-code the mapreduce tree
> > > (or parallel accumulator algorithm) because the SIMD size is fixed and known.
> >
> > That can be done in SV as well, since MAXVL is fixed and known,
>
> in RVV, yes, MAXVL is hardcoded to the number of microarchitectural lanes.
>
> in SV, MAXVL sets the maximum length of the allocation of the regfile that
> may be used for the vector operation *and may be set to anywhere between 1
> and 64* at runtime, at any time. MAXVL=1 says "all ops are flatlined to
> scalar".
MAXVL is fixed and known at compile time, not at CPU design time; the register
allocator has to allocate the backing registers, after all. And no, changing
MAXVL at runtime is *not* correct: the compiler expects MAXVL to be a constant
that it chose for each instruction in the program, though that constant can
differ from one instruction to the next.

Since the compiler knows MAXVL, it *can* generate a reduction tree at
compile time. MAXVL doesn't need to be a power of 2.

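To illustrate the point about non-power-of-2 sizes, here is a minimal sketch in
Python of how a reduction schedule could be computed once MAXVL is known. The
names reduction_schedule and run_schedule are made up for this example, and the
(dst, src) pairs stand in for whatever add instructions a compiler would
actually emit; this is not SV assembly or any real compiler's output.

```python
def reduction_schedule(maxvl):
    """Return (dst, src) element-index pairs for a pairwise add tree.

    Executing regs[dst] += regs[src] for each pair, in order, leaves the
    sum of all maxvl elements in regs[0]. maxvl may be any value >= 1.
    """
    steps = []
    stride = 1
    while stride < maxvl:
        # Pair each surviving element with its neighbour `stride` away;
        # an element with no partner at this level is simply carried up.
        for dst in range(0, maxvl - stride, 2 * stride):
            steps.append((dst, dst + stride))
        stride *= 2
    return steps


def run_schedule(values, steps):
    """Apply a schedule to a copy of `values` and return the total."""
    regs = list(values)
    for dst, src in steps:
        regs[dst] += regs[src]
    return regs[0]


# Example: MAXVL = 5 needs 4 adds arranged in 3 levels.
schedule = reduction_schedule(5)
total = run_schedule([1, 2, 3, 4, 5], schedule)
```

For MAXVL = 5 the schedule is (0,1), (2,3), (0,2), (0,4): always MAXVL - 1
adds, with the odd element folded in at the last level.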
Adding a microcoded reduce op would allow optimizing for VL, which is a
runtime variable, instead of for MAXVL, as well as eliminating extraneous
moves and reducing power. Also, a hardware-level reduce op could use a 4-way
add-reduce ALU, kind of like the adder-tree part of a Wallace tree multiplier.
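
As a rough behavioural model only (not a hardware design), the following Python
sketch shows why working from the runtime VL matters: the number of passes
through a 4-input adder depends on VL, not on MAXVL. The function name
add_reduce_4way and the idea of counting "passes" are assumptions made for this
illustration.

```python
def add_reduce_4way(regs, vl):
    """Sum regs[0:vl] with a hypothetical 4-input adder.

    Returns (total, passes), where `passes` is the number of tree levels
    needed when each adder step combines up to four values.
    """
    live = list(regs[:vl])
    passes = 0
    while len(live) > 1:
        # One level of the tree: every group of up to 4 values becomes 1.
        live = [sum(live[i:i + 4]) for i in range(0, len(live), 4)]
        passes += 1
    return (live[0] if live else 0), passes


regs = list(range(64))                   # backing registers, MAXVL = 64
short_result = add_reduce_4way(regs, 5)   # VL = 5 at runtime
full_result = add_reduce_4way(regs, 64)   # VL = MAXVL
```

With VL = 5 this takes 2 passes, against 3 for the full 64 elements, and no
moves are needed to pad or compact the vector first.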
--
You are receiving this mail because:
You are on the CC list for the bug.