[Libre-soc-bugs] [Bug 1155] O(n^2) multiplication REMAP mode(s)

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Sat Dec 23 17:23:05 GMT 2023


https://bugs.libre-soc.org/show_bug.cgi?id=1155

--- Comment #23 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #18)
> (In reply to Luke Kenneth Casson Leighton from comment #16)
> > oo hang on, i have my head round this now :) so i have a question:
> > at the cross-over point to move to the next row (the outer loop)
> > is it *guaranteed* that ca=0 such that it will not interfere with
> > the new row?
> 
> yes afaict. I did an exhaustive check for word sizes of 2 bits and 3-word by
> 3-word multiplication (code below), so this formally proves that case. I
> have no reason to think that doesn't extend to more words and/or bigger
> words, so it should work.

i know that the multiply itself, even with add-64-bit, cannot exceed
the boundary of 128.

and therefore we know also that an N-mul (N=256,512,etc) cannot exceed
the boundary of N*2 bits

and therefore i believe it reasonable to assume that an Nx64 (N mod 64 == 0)
multiply would also not exceed the boundary of.. err.... N+1?

which means that the last digit (N+1) of that add is never going to carry.

therefore, logically, if NxM (N%64==0 and M%64==0) is constructed from
a sequence of M//64 adds, each of length N+1 but shifted over (long-mul
style) then the last in each of those N+1 things is never going to involve
a carry...

but there *is* a mul-and-add involved, and i recall that even a mul-and-add
cannot produce a carry.

* 0xffff * 0xffff              = 0xfffe0001
* 0xffff * 0xffff + 0xffff     = 0xffff0000
* 0xffff * 0xffff + 0xffff + 1 = 0xffff0001

so even if you have mul-and-add-*and-carry* it still never exceeds 2x the bits.

because if you do (0xffff+1) * (0xffff+1) it is a^2 + 2ab + b^2
compared to 0xffff*0xffff and lay it out as a 2D square (a on x-axis
b on y-axis) only the bottom-left quadrant is full but the other
3 quadrants (representing the ab + ab + b^2) are entirely empty.

   a*b=0x0000ffff b*b=         1
   a^2=0xfffe0001 b*a=0x0000ffff

vs

     ........   ........
   0xfffe0001   ........

* b=1 therefore b^2 = 1
* a=0xffff therefore a^2 you have 0xfffe0001

but you have to have *two* lots of ab (0xffff) to reach 0x10000*0x10000
which as long as you do not try to do mul-add with 2 x 0xffff you are
*guaranteed* never to carry-over on a madd operation.

with maddedu only doing RC=0xffff, and not even taking a CA flag,
we're good.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


More information about the libre-soc-bugs mailing list