[Libre-soc-dev] svp64 review and "FlexiVec" alternative
Jacob Lifshay
programmerjake at gmail.com
Wed Aug 3 06:56:39 BST 2022
On Tue, Aug 2, 2022 at 9:53 PM Jacob Bachmeyer via Libre-soc-dev
<libre-soc-dev at lists.libre-soc.org> wrote:
>
> lkcl wrote:
> > i have a feeling that Mitch worked out how to do it. FMAC
> > having in effect a Scalar accumulator (src==dest) whilst
> > other operands get tagged as vectors, HW can detect that and
> > go "ah HA! what you *actually* want here is a horizontal
> > sum, let me just microcode that for you".
> >
>
> Well, now that I think about it, yes, FlexiVec *can* express a
> horizontal sum by accumulating into a scalar register. Hardware
> recognizes this very simply: an ADD targeting a scalar register RX,
> using that same RX and a vector register RY. This will also work with
> the null implementation.
Do note that this trick only works well for integer add. Floating-point
add is not associative, so it must be run serially (assuming the
semantics are equivalent to running the code serially from element 0
to the end). SVP64 specifically has an O(log N) parallel tree
reduction mode to work around that.
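
To make the non-associativity point concrete, here is a rough sketch in
Python. The names serial_sum and tree_sum are made up for this example,
and the tree is a generic adjacent-pairs reduction, which I am assuming
only for illustration: it is not meant to match SVP64's actual reduction
schedule. It just shows that the two orderings agree for integers but
can disagree for IEEE 754 doubles.

```python
def serial_sum(values):
    """Accumulate strictly from element 0 to the end (scalar-accumulator order)."""
    acc = values[0]
    for v in values[1:]:
        acc = acc + v
    return acc


def tree_sum(values):
    """Generic pairwise tree reduction: O(log N) levels of adjacent-pair adds.

    Illustrative only; the pairing order is an assumption, not SVP64's schedule.
    """
    level = list(values)
    while len(level) > 1:
        # Add adjacent pairs; each add in one level is independent of the others.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element carries over to the next level
        level = nxt
    return level[0]


floats = [1e16, 1.0, -1e16, 1.0]
ints = [10**16, 1, -10**16, 1]

print(serial_sum(ints), tree_sum(ints))      # integer add: both orders agree
print(serial_sum(floats), tree_sum(floats))  # float add: the orders disagree
```

With these inputs the integer sums match, while the float sums differ
because 1e16 + 1.0 rounds back to 1e16 in double precision, so the
result depends on which operands get paired first.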
Jacob