[Libre-soc-bugs] [Bug 1157] Implement poly1305

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Sun Sep 10 21:10:02 BST 2023


https://bugs.libre-soc.org/show_bug.cgi?id=1157

--- Comment #3 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Sadoon Albader from comment #2)
> https://libre-soc.org/irclog/latest.log.html#t2023-09-10T18:37:56

ok that's not a bad implementation, look for regular patterns.

btw this will do as the unit test: just drop it into the openpower-isa
repo, and make sure to preserve the copyright, where it came from, and
the license

https://github.com/ph4r05/py-chacha20poly1305/blob/master/chacha20poly1305/poly1305.py

hubert's is LGPLv2
https://github.com/ph4r05/py-chacha20poly1305/blob/master/LICENSE

for the regular patterns, you can source the operands from alternate
regs, using REMAP Index and Vertical-First, just like in chacha20


so, line 176
https://github.com/floodyberry/poly1305-donna/blob/e6ad6e091d30d7f4ec2d4f978be1fcfcbce72781/poly1305-donna-64.h#L176

        c = 0;       /* added: makes the pattern regular */
        h1 += c;     c = (h1 >> 44); h1 &= 0xfffffffffff;
        h2 += c;     c = (h2 >> 42); h2 &= 0x3ffffffffff;
        h0 += c * 5; c = (h0 >> 44); h0 &= 0xfffffffffff;
        h1 += c;     c = (h1 >> 44); h1 &= 0xfffffffffff;
        h2 += c;     c = (h2 >> 42); h2 &= 0x3ffffffffff;
        h0 += c * 5; c = (h0 >> 44); h0 &= 0xfffffffffff;
        h1 += c;     /* do this manually (outside the loop) */
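
a quick way to convince yourself that the extra "c = 0" changes nothing:
the sketch below is plain Python, not SVP64. carry_unrolled is the
sequence above without the added "c = 0" and first "h1 += c" (assumed
here to be the original form), carry_regular is the regularised version
as a 2-iteration loop. both function names are made up for illustration.

```python
M44 = (1 << 44) - 1  # 44-bit limb mask (0xfffffffffff)
M42 = (1 << 42) - 1  # 42-bit limb mask (0x3ffffffffff)


def carry_unrolled(h0, h1, h2):
    """Carry chain without the leading 'c = 0' / 'h1 += c' (assumed original)."""
    c = h1 >> 44
    h1 &= M44
    h2 += c
    c = h2 >> 42
    h2 &= M42
    h0 += c * 5
    c = h0 >> 44
    h0 &= M44
    h1 += c
    c = h1 >> 44
    h1 &= M44
    h2 += c
    c = h2 >> 42
    h2 &= M42
    h0 += c * 5
    c = h0 >> 44
    h0 &= M44
    h1 += c
    return h0, h1, h2


def carry_regular(h0, h1, h2):
    """Same chain with c = 0 up front, so both passes have identical shape."""
    c = 0
    for _ in range(2):
        h1 += c
        c = h1 >> 44
        h1 &= M44
        h2 += c
        c = h2 >> 42
        h2 &= M42
        h0 += c * 5
        c = h0 >> 44
        h0 &= M44
    h1 += c  # the final add stays outside the loop
    return h0, h1, h2
```

adding zero to h1 on the first pass is a no-op, so the two functions
should agree on any input, e.g. carry_regular(0, 0, 1 << 42) and
carry_unrolled(0, 0, 1 << 42).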

what you can do is: 

* 2 regs with the mask consts 0xfffffffffff and 0x3ffffffffff,
  then REMAP Indices 0 1 0 0 1 0
* another 2 regs with shift amounts 44 and 42, then REMAP
  Indices 0 1 0 0 1 0. oh look, those are the same: use the
  exact same SVSHAPE as above
* 2 regs with a constant 1 and 5 for a mul c*1 or c*5,
  here you will need a 2nd REMAP Index 0 0 1 0 0 1
* 3 regs h0 h1 h2 use another REMAP Index 1 2 0 1 2 0
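
the four bullets above can be modelled in plain Python before writing
any assembler. this is only a sketch: lists stand in for the register
groups, the IDX_* tables stand in for the REMAP Index SVSHAPEs, and the
for-loop stands in for the Vertical-First loop. all names here are
hypothetical, nothing below is openpower-isa API.

```python
# "registers" holding the constants
MASKS = [(1 << 44) - 1, (1 << 42) - 1]  # 0xfffffffffff, 0x3ffffffffff
SHIFTS = [44, 42]
MULS = [1, 5]

# index tables, one entry per loop step
IDX_MASK_SHIFT = [0, 1, 0, 0, 1, 0]  # shared by the mask and the shift
IDX_MUL = [0, 0, 1, 0, 0, 1]         # c*1, c*1, c*5, repeated
IDX_H = [1, 2, 0, 1, 2, 0]           # h1, h2, h0, repeated


def carry_indexed(h0, h1, h2):
    """Model of the 6-step carry loop driven purely by index tables."""
    h = [h0, h1, h2]
    c = 0
    for step in range(6):
        hi = IDX_H[step]
        ms = IDX_MASK_SHIFT[step]
        t = c * MULS[IDX_MUL[step]]  # mul: c*1 or c*5, no branch needed
        h[hi] += t                   # add
        c = h[hi] >> SHIFTS[ms]      # shift: extract the carry
        h[hi] &= MASKS[ms]           # AND: keep the low limb bits
    h[1] += c                        # final h1 += c, outside the loop
    return tuple(h)
```

every step runs the same mul, add, shift, AND sequence, only the indices
change, which is the property the REMAP approach relies on. for example
carry_indexed(0, 0, 1 << 42) carries out of h2 and folds it back into h0
multiplied by 5.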

if you are very lucky you can target the same reg for the shift
as the AND, which means you don't need to do svremap inside the
loop, you can do it outside.


whatever you do, *go slowly*. even just the above is enough
to get on with, do it *as its own unit test*.  it will be:

* 5 instructions to set up Index SVSHAPEs and loop counter,
  and setvl in vertical-first mode
* a mul, an add, a shift, an AND
* svstep to decrement the counter and bc to do the VF loop
* an add for h1+=c

12 instructions, absolute max.

(mul by 5 or 1 avoids pissing about with branches, which would
damage parallelism).

strongly suggest *only* doing this, as a first incremental step.
