[Libre-soc-dev] SVP64 Vectorised add-carry => big int add

Sat Apr 16 06:54:43 BST 2022

On April 16, 2022 12:26:38 AM UTC, lkcl <luke.leighton at gmail.com> wrote:

>            uint64_t v = (uint64_t)q[i] * d[j] + carry;
>            carry = v >> 32;
>            v = (uint32_t)v;

right. ok. i have a bit more of a handle on this.

both halves are needed, but normally in scalar mul you can do macro op fusion:

* mullo r3, r10, r11
* mulhi r4, r10, r11

==>

* OP_MULLOHI r3&4, r10, r11
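
as a rough sketch in C of why the fusion is attractive: both halves fall out of one widening multiply, so the fused op does one multiply where the unfused pair does two. this assumes 32-bit operands as in the quoted code, and the function names below (mullo, mulhi, mullohi) are illustrative models only, not real opcodes.

```c
#include <stdint.h>

/* illustrative model of "mullo": low 32 bits of the 64-bit product */
static uint32_t mullo(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}

/* illustrative model of "mulhi": high 32 bits of the 64-bit product */
static uint32_t mulhi(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* hypothetical fused op: one multiply, both halves written out */
static void mullohi(uint32_t *lo, uint32_t *hi, uint32_t a, uint32_t b)
{
    uint64_t v = (uint64_t)a * b;   /* single widening multiply */
    *lo = (uint32_t)v;              /* what mullo would have produced */
    *hi = (uint32_t)(v >> 32);      /* what mulhi would have produced */
}
```

the fused version gives the same two results as calling mullo and mulhi separately on the same inputs.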

when SVP64 Vectorised, the element ops are split up, so the mullo/mulhi pairs are no longer available to fuse, unless the same fusion trick is actually done on the vector ops *before* they are put into element execution.

question is, is it worth adding a mulx (a single op that produces both halves)? and if so, is it worth trying to overload OE=1 on say "sv.madd" rather than adding a new opcode?

(madd is RT=RA*RB+RC, maddo would be {RT,RT+1}=RA*RB+RC and sv.maddo would be {RT,RT+VL}=RA*RB+RC)
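
a minimal C model of those three forms, to make the register numbering concrete. assumptions, none of which are decided above: 32-bit elements (to match the quoted code), the low half going to RT and the high half to RT+1 or RT+VL, all three source operands treated as vectors in the sv. form, and an illustrative register file called gpr. source and destination registers are assumed not to overlap.

```c
#include <stdint.h>

#define NREGS 128                /* illustrative register file size */
static uint32_t gpr[NREGS];      /* hypothetical 32-bit-element regfile */

/* madd: RT = RA*RB+RC, only the low half is kept */
static void madd(int rt, int ra, int rb, int rc)
{
    gpr[rt] = (uint32_t)((uint64_t)gpr[ra] * gpr[rb] + gpr[rc]);
}

/* maddo (proposed): {RT,RT+1} = RA*RB+RC
 * assumption: low half to RT, high half to RT+1 */
static void maddo(int rt, int ra, int rb, int rc)
{
    /* (2^32-1)^2 + (2^32-1) < 2^64, so this sum cannot overflow */
    uint64_t v = (uint64_t)gpr[ra] * gpr[rb] + gpr[rc];
    gpr[rt]     = (uint32_t)v;
    gpr[rt + 1] = (uint32_t)(v >> 32);
}

/* sv.maddo (proposed): {RT,RT+VL} = RA*RB+RC, element by element
 * assumption: low halves fill RT..RT+VL-1, high halves RT+VL..RT+2*VL-1 */
static void sv_maddo(int rt, int ra, int rb, int rc, int vl)
{
    for (int i = 0; i < vl; i++) {
        uint64_t v = (uint64_t)gpr[ra + i] * gpr[rb + i] + gpr[rc + i];
        gpr[rt + i]      = (uint32_t)v;
        gpr[rt + vl + i] = (uint32_t)(v >> 32);
    }
}
```

in terms of the quoted loop, the low half is the new v and the high half is the new carry, so one maddo covers all three quoted lines for one element.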

l.