[Libre-soc-dev] new svp64 page

Thu Dec 10 19:05:12 GMT 2020

On Thu, Dec 10, 2020, 10:07 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:

> On 12/10/20, Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:
> > On 12/10/20, Lauri Kasanen <cand at gmx.com> wrote:
> >> On Thu, 10 Dec 2020 16:27:33 +0000
> >> Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:
> >>
> >>> lauri, jacob, what's your thoughts on using 2 bits for clamping mode?
> >>> this is *not* the same as elwidth itself, which is the "chop" in VSX
> >>> ops pseudocode.
> >>>
> >>> or: another idea:
> >>>
> >>> * extsb, extsh, extsw specify one type of width
> >>> * twin predication specifies 2 more (src elwidth, dest elwidth)
> >>> * 1 bit says "operation is to be clamped" (not to which range, that's
> >>> implicit)
> >>
> >> I can't come up with a use case for having different clamping to dst
> >> elwidth. If you want 8-bit unsigned saturation, there's no reason for
> >> you to write that to 16-bit elements. So I would take the clamp width
> >> from the dst elwidth.
> >
> > it's not that the elwidth has a reason (or not), it's that add and
> > other arith ops *don't* have sign/uns (except for mul and div) and
> > they don't have a full range of 8/16/32.
> >
> > now, if we allow dest elwidth even on 2-src *arithmetic* operations
> > (something that was left out of SVP originally because of lack of
> > space), then now the one bit "sat" (or 2 bit, one for signed one for
> > unsigned) starts to gel.
> >
> >> I would simply have two bits to enable clamping, unsigned and signed.
> >> 16 and 32 bit do need both, not just 8-bit.
> >
> > i realised belatedly that add does not have add-signed as separate
> > from add-unsigned.  nor is there, in Power, an add8 or add16.
> >
> > i will see if there's space in the 24 bits for dest elwidth and 2 bits
> > for sat mode.

> there is.
>
> does this look like a reasonable general-purpose algorithm, applicable
> to all operations, whether exts*, mr, or 2/3 arithmetic ops?
>

I don't think we need a second elwidth except for size-conversion ops.
Saturating ops don't need an add with 8-bit output and 16-bit input (or
other 3-argument ALU ops with a different output size). To implement
averaging, we could repurpose xor (or some other bitwise op) so that,
with saturation enabled, it instead means averaging add.
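
As a rough illustration of what an averaging add would compute, here is a
minimal Python sketch. The names avg_add and wrap_add are hypothetical, and
the truncating (round-down) behaviour and unsigned operands are assumptions;
none of this is settled in the thread. The point is only that the sum needs
one more bit than the element width before the shift, so the result always
fits in the same elwidth as the inputs.

```python
def wrap_add(a: int, b: int, elwidth: int = 8) -> int:
    """Ordinary modular add on elwidth-bit unsigned elements, for contrast."""
    mask = (1 << elwidth) - 1
    return (a + b) & mask


def avg_add(a: int, b: int, elwidth: int = 8) -> int:
    """Unsigned averaging add on elwidth-bit elements.

    Assumption: truncating average, (a + b) >> 1. A rounding variant
    would add 1 before the shift.
    """
    mask = (1 << elwidth) - 1
    a &= mask
    b &= mask
    # The intermediate sum needs elwidth + 1 bits; Python ints are
    # unbounded, so the carry is kept until after the shift.
    return (a + b) >> 1


if __name__ == "__main__":
    print(wrap_add(200, 100))  # wraps around at 8 bits
    print(avg_add(200, 100))   # stays within 8 bits without wrapping
```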

We will also want saturating mul, saturating sub, and maybe saturating
lshift.
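
A minimal Python sketch of how these could behave, assuming the clamp range
is taken from the destination elwidth and that one bit selects signed versus
unsigned saturation, as suggested earlier in the thread. The function names
(clamp, sat_sub, sat_mul, sat_shl) are hypothetical and the operands are
assumed to be in range for the element width already; this is not the SVP64
pseudocode.

```python
def clamp(value: int, elwidth: int, signed: bool) -> int:
    """Clamp value to the range representable in elwidth bits."""
    if signed:
        lo = -(1 << (elwidth - 1))
        hi = (1 << (elwidth - 1)) - 1
    else:
        lo = 0
        hi = (1 << elwidth) - 1
    return max(lo, min(hi, value))


def sat_sub(a: int, b: int, elwidth: int, signed: bool) -> int:
    """Saturating subtract: compute at full precision, then clamp."""
    return clamp(a - b, elwidth, signed)


def sat_mul(a: int, b: int, elwidth: int, signed: bool) -> int:
    """Saturating multiply: compute at full precision, then clamp."""
    return clamp(a * b, elwidth, signed)


def sat_shl(a: int, shift: int, elwidth: int, signed: bool) -> int:
    """Saturating left shift: compute at full precision, then clamp."""
    return clamp(a << shift, elwidth, signed)


if __name__ == "__main__":
    print(sat_sub(10, 20, 8, signed=False))   # clamps at the unsigned minimum
    print(sat_mul(-64, 4, 8, signed=True))    # clamps at the signed minimum
    print(sat_shl(1, 7, 8, signed=True))      # clamps at the signed maximum
```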

Jacob