[Libre-soc-bugs] [Bug 1085] parallel prefix sum polynomial interpolation

Fri May 19 17:58:38 BST 2023

https://bugs.libre-soc.org/show_bug.cgi?id=1085

Jacob Lifshay <programmerjake at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |programmerjake at gmail.com

--- Comment #4 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #3)
> (In reply to Luke Kenneth Casson Leighton from comment #2)
> > <markos_> I've done a similar method in the past
> > <programmerjake> idk, but parallel prefix sum and parallel tree reduction
> > can be used for polynomial evaluation:
> > <programmerjake> reduce(element_wise_mul(c, prefix_sum([1] + splat(x),
> > op=mul)), op=add) ==
> > <programmerjake> c[0]+c[1]*x+c[2]*x^2+c[3]*x^3+c[4]*x^4...
> > <programmerjake> though it tends to have lower precision than using fma
> > because of rounding before adding
> 
> ok writing this out myself so that my brain absorbs it
> 
> * start with:
>     [1, x, x, x, x...]
> * perform prefix sum multiply (parallel is fine)
>     [1, x, x^2, x^3, ....]
> * take vector of c
>     [c0, c1, c2, c3, c4...]
> * straight multiply:
>     [c0, c1*x, c2*x^2, ....]
> * perform reduction
>   c0 + c1*x + c2*x^2 ...

parallel reduction is fine here too -- reduce using fadd, though as noted it
gets less accuracy than using fma (because fma doesn't round between the mul
and add and because higher powers of x tend to give monomials (includes
coefficients) that are smaller, so adding a smaller approximate value to a
larger exact value (the coefficient) gives a value with higher relative
precision.

-- 
You are receiving this mail because:
You are on the CC list for the bug.