[Libre-soc-dev] SVP64 Vectorised add-carry => big int add

Tue Apr 19 20:57:03 BST 2022

On April 19, 2022 7:12:05 PM UTC, Jacob Lifshay <programmerjake at gmail.com> wrote:
>On Tue, Apr 19, 2022, 04:43 lkcl <luke.leighton at gmail.com> wrote:

>> did you mean subfe here? (subfe: RT = ~RA + RB + CA)
>>
>
>yes, subfe a, b, c is sube a, c, b ... just like subf a, b, c is
>sub a, c, b

brilliant, that's really good news.  far less regs used all round.
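
for reference, a minimal python sketch of the operand swap quoted above. the 64-bit register width and the helper names are assumptions for illustration only ("sube" here just means the swapped-operand form, as sub is to subf); this is not simulator code.

```python
MASK = (1 << 64) - 1  # assumed 64-bit registers

def subfe(ra, rb, ca):
    """subfe RT, RA, RB: RT = ~RA + RB + CA. returns (RT, carry-out)."""
    total = (~ra & MASK) + (rb & MASK) + ca
    return total & MASK, total >> 64

def sube(ra, rb, ca):
    """hypothetical swapped-operand form: RT = RA + ~RB + CA."""
    return subfe(rb, ra, ca)

# with CA=1 coming in, subfe computes RB - RA and sets carry-out
# when there is no borrow
assert subfe(3, 10, 1) == (7, 1)
# sube a, c, b gives the same result as subfe a, b, c
assert sube(10, 3, 1) == subfe(3, 10, 1)
```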

>> how would the fixup condition be detected?
>
>assuming the loop is for(i = 0; i <= n; i++), then fixup is simply !CY.

yay, i thought it might be.  question is, how to get that out for testing, without a ton of instructions. it would be nice instead for CY to go in (and out) of CR0, or at least a copy of CA to go into CR0.SO for example, what do you think?
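
to make the "fixup is simply !CY" point concrete, here is a rough python sketch of a carry-chained multi-limb subtract. assumptions: 64-bit limbs, little-endian limb lists of equal length, CA preset to 1 before the first limb, and the function names are made up for illustration. how this maps onto the actual loop discussed in the thread is also an assumption.

```python
MASK = (1 << 64) - 1  # assumed 64-bit limbs

def bigint_sub(u, v):
    """u - v over little-endian limb lists of equal length.

    models a chain of subtract-extended ops: each limb computes
    a + ~b + CA, with CA starting at 1 and carried limb to limb.
    returns (result limbs, final CA).
    """
    ca = 1
    out = []
    for a, b in zip(u, v):
        total = a + (~b & MASK) + ca
        out.append(total & MASK)
        ca = total >> 64
    return out, ca

def needs_fixup(u, v):
    # final CA == 0 means a borrow came out of the top limb,
    # i.e. the result went negative
    _, ca = bigint_sub(u, v)
    return ca == 0
```

for example, needs_fixup([0, 1], [1, 0]) is False (2^64 minus 1 does not borrow out of the top limb), whereas needs_fixup([0, 0], [1, 0]) is True.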

>> * 2-bit EXTRA2, means 4 operands can be marked
>> * EXTRA IDX0: d:RT - RT as destination
>> * EXTRA IDX1: d:RC - RC as a destination
>> * EXTRA IDX2: s:RA - RA as a source
>> * EXTRA IDX3: s:RC - RC as a source
>>
>
>I never liked having separate EXTRA2/3 for both source and dest on the
>same register field...it makes register allocation a pain -- i'm very
>inclined to just have llvm always generate the exact same register for
>both.

i'd expect it to be years before full compiler support hits, and for this to be an assembler trick for pretty much forever.

>imho having RB be able to be a vector is more important, since it
>allows using mule for vectorized 128-bit arithmetic as a pair of
>64-bit vectors,

good point... ya know... there *is* one spare bit in the 9-bit EXTRA Field... :)
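
the bit budget behind that remark, as a tiny python sketch. it only restates the numbers already quoted above (9-bit EXTRA field, four 2-bit EXTRA2 entries); the variable names are made up and no particular bit ordering is implied.

```python
EXTRA_BITS = 9    # width of the EXTRA field, as stated above
EXTRA2_BITS = 2   # bits per marked operand in EXTRA2 mode

# the four operands marked in the quoted proposal
operands = ["d:RT", "d:RC", "s:RA", "s:RC"]

used = len(operands) * EXTRA2_BITS   # bits consumed by the markings
spare = EXTRA_BITS - used            # what is left over
```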

>One other thing we'd want is an unsigned * signed version of mule
>(muluse?), for bigint multiplication by a signed 64-bit number,

there's only 5 slots available in EXT04 so some care is needed here.

> RB would be signed and RA and RT would be unsigned, I'd have to think
>about what RC would be to make carrying correctly work.

ok.

l.