[Libre-soc-bugs] [Bug 1085] parallel prefix sum polynomial interpolation

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Fri May 19 13:59:44 BST 2023


--- Comment #3 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Luke Kenneth Casson Leighton from comment #2)
> <markos_> I've done a similar method in the past
> <programmerjake> idk, but parallel prefix sum and parallel tree reduction
> can be used for polynomial evaluation:
> <programmerjake> reduce(element_wise_mul(c, prefix_sum([1] + splat(x),
> op=mul)), op=add) ==
> <programmerjake> c[0]+c[1]*x+c[2]*x^2+c[3]*x^3+c[4]*x^4...
> <programmerjake> though it tends to have lower precision than using fma
> because of rounding before adding

ok writing this out myself so that my brain absorbs it

* start with:
    [1, x, x, x, x...]
* perform prefix sum multiply (parallel is fine)
    [1, x, x^2, x^3, ....]
* take vector of c
    [c0, c1, c2, c3, c4...]
* straight multiply:
    [c0, c1*x, c2*x^2, ....]
* perform reduction
  c0 + c1*x + c2*x^2 ...

ok that is dead easy.

so now what about parallel reduction using fmac?  let us first try
mapreduce on fmac:

    sv.madd/mr res, *c, *xv, res

that looks perfectly fine, and can get accuracy from

now parallel reduction, res would have to be a vector,
but i don't believe it works, because *c (or *xv) would
end up getting multiplied in more than once. unless done
as Vertical-First.

You are receiving this mail because:
You are on the CC list for the bug.

More information about the libre-soc-bugs mailing list