[Libre-soc-bugs] [Bug 1044] SVP64 implementation of pow(x,y,z)
bugzilla-daemon at libre-soc.org
Thu Sep 14 08:19:32 BST 2023
https://bugs.libre-soc.org/show_bug.cgi?id=1044
--- Comment #11 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #10)
> https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;
> h=e044403f4c243b3248df0872a661756b6cc8a984
>
> Author: Jacob Lifshay <programmerjake at gmail.com>
> Date: Wed Sep 13 23:24:12 2023 -0700
>
> add SVP64 256x256->512-bit multiply
>
> https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;
> h=ec1196cb3abedd28492be29546e52959dee1a030
all of these can go in a vertical-first loop, using
predication 0b1000100010001000 and sv.addi/pm=nn a,b,c
to "pretend" there is a pair of nested loops.
"sv.adde *7, *7, *21",
"addi 24, 0, 0",
"sv.maddedu *20, *12, 19, 24", # final partial-product a * b[3]
"addc 7, 7, 20",
"sv.adde *8, *8, *21",
i.e. sv.addi with scalars does exactly the same thing as the
non-prefixed equivalent except you get predication and elwidth
overrides
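
to illustrate the predication trick, here is a minimal plain-Python sketch (a model of the idea only, not simulator or ISA code). it assumes a flat 16-step loop standing in for 4x4 nested loops; the names `flat_loop_events` and `nested_loop_events` are made up for this example. the mask bits 3, 7, 11 and 15 let the "outer" operation fire only at the end of each group of four steps.

```python
# Hypothetical model: a predicate mask makes one flat loop behave
# like a pair of nested loops. Not real SVP64 or simulator code.
MASK = 0b1000100010001000  # bits 3, 7, 11, 15 are set
INNER = 4                  # assumed inner-loop length


def flat_loop_events(mask=MASK, steps=16):
    """Single flat loop: inner work every step, outer work where mask bit is 1."""
    events = []
    for step in range(steps):
        events.append(("inner", step))
        if (mask >> step) & 1:
            # predicated operation: only runs when its mask bit is set
            events.append(("outer", step // INNER))
    return events


def nested_loop_events(outer=4, inner=INNER):
    """Reference: the explicit pair of nested loops being imitated."""
    events = []
    for i in range(outer):
        for j in range(inner):
            events.append(("inner", i * inner + j))
        events.append(("outer", i))
    return events


if __name__ == "__main__":
    print(flat_loop_events() == nested_loop_events())
```

both functions produce the same sequence of events, which is the sense in which the mask "pretends" there are nested loops.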
the extra sv.adde can add a zero'd out source, making the first
iteration orthogonal with all other iterations and consequently
the VF loop can be applied.
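
a rough Python sketch of why the zero'd source helps, under stated assumptions: 64-bit limbs, a 256x256->512-bit multiply, and a simplified model of the multiply-add-with-carry row (`maddedu_row` is a made-up helper, not an API from openpower-isa). because the accumulator starts as all zeros, row 0 goes through exactly the same multiply-then-add steps as rows 1 to 3, so no iteration needs special-casing.

```python
# Hypothetical model of a 256x256 -> 512-bit multiply using 64-bit limbs.
# Helper names are illustrative only; this is not the code from the commit.
MASK64 = (1 << 64) - 1


def to_limbs(x, n):
    """Split x into n little-endian 64-bit limbs."""
    return [(x >> (64 * i)) & MASK64 for i in range(n)]


def from_limbs(limbs):
    """Rebuild an integer from little-endian 64-bit limbs."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))


def maddedu_row(a_limbs, b_limb):
    """One partial product a * b_limb: low halves in the row, high half carried."""
    carry = 0
    row = []
    for a in a_limbs:
        prod = a * b_limb + carry
        row.append(prod & MASK64)
        carry = prod >> 64
    row.append(carry)  # final carry becomes the top limb of the row
    return row


def mul_256x256(a, b):
    a_limbs = to_limbs(a, 4)
    b_limbs = to_limbs(b, 4)
    acc = [0] * 8  # zeroed accumulator: row 0 is added just like every other row
    for j, b_limb in enumerate(b_limbs):
        row = maddedu_row(a_limbs, b_limb)
        carry = 0
        for i, r in enumerate(row):
            s = acc[i + j] + r + carry
            acc[i + j] = s & MASK64
            carry = s >> 64
        # the running sum always fits in limbs 0..j+4, so nothing is lost here
        assert carry == 0
    return from_limbs(acc)


if __name__ == "__main__":
    a = (1 << 256) - 1
    b = (1 << 256) - 12345
    print(mul_256x256(a, b) == a * b)
```

the uniform loop body is what makes it a candidate for a single vertical-first loop; the cost is one extra add against zero on the first row.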
and you can use Indexed REMAP on the sv.maddedu and sv.adde.
the only pain being that you have to enable/disable svremap before each one,
so i suggest using "non-persistent" svremap.
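
as a loose illustration of the index-table idea (a plain-Python model only: it does not reproduce svindex/svremap encodings or register allocation, and every name in it is hypothetical), the nested limb loops can be flattened into one loop whose operand offsets are looked up per step in precomputed tables:

```python
# Hypothetical model: per-step index tables replace nested-loop counters.
# This only models "look the element offset up in a table"; it is not SVP64.
MASK64 = (1 << 64) - 1


def to_limbs(x, n):
    """Split x into n little-endian 64-bit limbs."""
    return [(x >> (64 * i)) & MASK64 for i in range(n)]


def from_limbs(limbs):
    """Rebuild an integer from little-endian 64-bit limbs."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))


def build_index_tables(n=4):
    """For each flat step, which a limb, which b limb, which accumulator limb."""
    a_idx, b_idx, acc_idx = [], [], []
    for j in range(n):
        for i in range(n):
            a_idx.append(i)
            b_idx.append(j)
            acc_idx.append(i + j)
    return a_idx, b_idx, acc_idx


def mul_flat(a_limbs, b_limbs):
    """Multiply using one flat loop driven entirely by the index tables."""
    n = len(a_limbs)
    a_idx, b_idx, acc_idx = build_index_tables(n)
    acc = [0] * (2 * n)
    for step in range(n * n):
        prod = a_limbs[a_idx[step]] * b_limbs[b_idx[step]]
        k = acc_idx[step]
        # add the 128-bit product into the accumulator, rippling the carry up
        while prod:
            s = acc[k] + (prod & MASK64)
            acc[k] = s & MASK64
            prod = (prod >> 64) + (s >> 64)
            k += 1
    return acc


if __name__ == "__main__":
    a = (1 << 256) - 1
    b = (1 << 255) + 7
    print(from_limbs(mul_flat(to_limbs(a, 4), to_limbs(b, 4))) == a * b)
```

in this model the tables are the only place the loop structure lives, which is roughly why each remapped operation needs its own remap set up before it.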
also, same as for Andrey: all files require a copyright notice, as you
know. see https://bugs.libre-soc.org/show_bug.cgi?id=1126#c13
> Author: Jacob Lifshay <programmerjake at gmail.com>
> Date: Wed Sep 13 23:22:33 2023 -0700
>
> generalize assemble() fn so other test cases can easily use it
like it.