[Libre-soc-bugs] [Bug 1044] SVP64 implementation of pow(x,y,z)

bugzilla-daemon at libre-soc.org
Mon Oct 9 08:59:17 BST 2023


--- Comment #38 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #36)
> the purpose of the exercise here is not "get it done".
> the purpose of the entire grant is to be *efficient* in the
> ISA, to prove that it is efficient, and if it is not then
> *modify* the ISA to *make* it efficient.

well, vertical-first is probably not beneficial because the algorithm needs to
run a bunch of operations at different rates (1-8 mul/sub for every div,
because the mul/sub are nested inside another loop and the div is not). svp64
works well when all of the operations can be done at the same rate, and works
less well when the rates differ wildly, because you need to jump through
predication hoops to get it to work. scalar Power ISA instructions are quite
efficient for the operations that aren't uniform and aren't the highest rate.
if you try to cram it all into a vertical-first loop, it may be possible, but
it won't be pretty and probably won't even be faster than scalar ops plus
horizontal-first loops, partly because of all the complex setup code and
partly because standard OoO cpus can already handle those scalar and
hf-svp64 data flow patterns quite efficiently.
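
to make the rate mismatch concrete, here is a minimal Python sketch. it assumes
the divide in question has the shape of schoolbook long division (Knuth's
Algorithm D); that is an assumption on my part, not something stated above.
the names knuth_d, to_digits and from_digits, the counts dictionary and the
16-bit digit size are all hypothetical, chosen only for illustration. the point
is that the word-sized div runs once per quotient digit while the mul/sub runs
once per divisor digit inside it.

```python
def to_digits(x, count, base):
    """Split a non-negative int into `count` little-endian digits."""
    return [(x // base**i) % base for i in range(count)]


def from_digits(digits, base):
    """Inverse of to_digits."""
    return sum(d * base**i for i, d in enumerate(digits))


def knuth_d(u, v, base=1 << 16):
    """Divide digit list u by digit list v (both little-endian).

    Returns (quotient digits, remainder as int, operation counts).
    Hypothetical illustration only, not the Libre-SOC implementation.
    """
    n = len(v)
    m = len(u) - n
    assert n >= 2 and m >= 0 and v[-1] != 0
    # normalise so the top divisor digit is at least base/2
    shift = 0
    while (v[-1] << shift) < base // 2:
        shift += 1
    vn = to_digits(from_digits(v, base) << shift, n, base)
    un = to_digits(from_digits(u, base) << shift, m + n + 1, base)
    q = [0] * (m + 1)
    counts = {"div": 0, "mulsub": 0}
    for j in range(m, -1, -1):
        # outer loop: exactly one word-sized divide per quotient digit
        num = un[j + n] * base + un[j + n - 1]
        qhat, rhat = divmod(num, vn[n - 1])
        counts["div"] += 1
        while qhat >= base or qhat * vn[n - 2] > rhat * base + un[j + n - 2]:
            qhat -= 1
            rhat += vn[n - 1]
            if rhat >= base:
                break
        # inner loop: n multiply-subtracts for that single divide
        borrow = 0
        for i in range(n):
            t = un[i + j] - qhat * vn[i] - borrow
            un[i + j] = t % base
            borrow = -(t // base)
            counts["mulsub"] += 1
        t = un[j + n] - borrow
        un[j + n] = t % base
        if t < 0:
            # the estimate was one too large: add the divisor back
            qhat -= 1
            carry = 0
            for i in range(n):
                s = un[i + j] + vn[i] + carry
                un[i + j] = s % base
                carry = s // base
            un[j + n] = (un[j + n] + carry) % base
        q[j] = qhat
    r = from_digits(un[:n], base) >> shift
    return q, r, counts
```

with an 8-digit dividend and a 4-digit divisor this does 5 divs and 20
mul/subs, i.e. n mul/subs per div. the inner loop is uniform and looks like a
natural fit for a horizontal-first vector op, while the once-per-digit div and
the qhat fix-up are the non-uniform part that scalar instructions handle well.
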
> if you have not listened to the primary designer of the ISA
> who knows all of the tricks that make algorithms efficient,

convoluted and squished into one vertical-first loop with lots of remap isn't
the same thing as efficient. being vertical-first may make it *less*
efficient, even if we add new remap modes.

How about this:
we write *both* a horizontal-first (since I find that straightforward and
relatively obvious) and later attempt a vertical-first version? we can then
compare them and see whether all the extra complexity is beneficial.
