[Libre-soc-dev] [RFC] SVP64 Branches (contd)

lkcl luke.leighton at gmail.com
Sun Sep 12 13:03:55 BST 2021


the number of modes available in SVP64 Branches is considerable and they have significant cross-interaction.  Branch-Conditional is also a strategically critical instruction for High Performance Vector Supercomputing, being as it is quite literally the fundamental basis of critical inner loops.

it therefore needs extremely thorough review, and this is a request for help in carrying out that review in public. all public comments from any source are welcome, and should be sent to the libre-soc-dev public mailing list.

fundamentally, as usual with SVP64, there is the "Base Scalar v3.0B" and there is the Vectorisation Prefix.  normally there is a hard rule set which prohibits SVP64 from deviating from underlying Scalar v3.0B behaviour: in this particular case however there is good reason to consider doing so.

however, given how critically important it is that SVP64 *NOT* alter Scalar v3.0B in any way, or be *misconstrued* as altering Scalar v3.0B in any way, we consider SVP64 Branch-Conditional to be completely separate instructions.

a quick summary of the modes available:

* there are the usual three bits altering the base scalar v3.0B behaviour (BO[0] to BO[2])
* predication interacts closely with the Condition Test
* the opportunity to make LR only update if the branch also takes place seems to have been overlooked in Scalar v3.0B: this is added as an SVP64-only option
* Horizontal and Vertical Vector Modes slightly alter the behaviour (ALL mode is not relevant to Vertical-First)
* Horizontal ALL or ANY testing combined with BO[1] results in AND, OR, NAND or NOR of Condition Tests
* CTR Mode has four separate *additional* sub-modes
* Vector Truncation to the Branch Point is also optional.
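
to illustrate the ALL/ANY point above, here is a minimal sketch (not the SVP64 pseudocode) of how a Horizontal test combined with BO[1] could give the four reductions. it assumes BO[1] keeps its Scalar v3.0B meaning (1: branch if the CR bit is set, 0: branch if it is clear). the function names are hypothetical, and predication, CTR Mode and Vector Truncation are deliberately left out.

```python
from typing import Sequence


def element_test(cr_bit: int, bo1: int) -> bool:
    """Per-element Condition Test (assumed): passes when the CR bit equals BO[1]."""
    return cr_bit == bo1


def horizontal_branch_taken(cr_bits: Sequence[int], bo1: int, all_mode: bool) -> bool:
    """Reduce the per-element tests with ALL or ANY (hypothetical helper).

    bo1=1, ALL -> AND  of the CR bits
    bo1=1, ANY -> OR   of the CR bits
    bo1=0, ALL -> NOR  of the CR bits (every bit clear)
    bo1=0, ANY -> NAND of the CR bits (at least one bit clear)
    """
    tests = [element_test(bit, bo1) for bit in cr_bits]
    # Note: with an empty vector, all() is True and any() is False in Python;
    # what the hardware should do in that case is not covered by this sketch.
    return all(tests) if all_mode else any(tests)


if __name__ == "__main__":
    bits = [1, 0, 1]
    for bo1 in (1, 0):
        for all_mode in (True, False):
            mode = "ALL" if all_mode else "ANY"
            print(f"BO[1]={bo1} {mode}: {horizontal_branch_taken(bits, bo1, all_mode)}")
```

the point of the sketch is only that inverting each element test before an ALL or ANY reduction is what turns AND/OR into NOR/NAND (De Morgan), so no separate NAND or NOR mode bit is needed.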

this brings the total combined number of options to somewhere around 2^8 (256 possible behaviours) which is far beyond anything i have ever seen in any Vector Supercomputing or 3D GPU ISA of the past 50 years.

interestingly, much of this comprehensiveness is down to the fact that Scalar v3.0B Branches are themselves quite comprehensive (CTR Mode). Without CTR Mode, SVP64 Conditional Branches would be exponentially reduced in functionality and usefulness for Supercomputing purposes.

given that the Power ISA has a reputation as a long-term stable ISA, we would clearly like, and expect, that stability to continue.

therefore proper and thorough review with proper feedback and open discussion even at an early stage is critical.

SVP64 is an *extremely* comprehensive ISA that takes considerable prior knowledge of 3D GPU and Vector Supercomputer ISAs of the past 50 years to appreciate why it is the way that it is.

* Mitch Alsup's MyISA 66000, a comparative peer, has been in draft form for a similar timeframe (over 3 years).
* the author of MRISC32 has been developing the MRISC Vector Processing ISA for over 18 months and is still catching up with modern and historic Vector Processing techniques and background.

the absolute last thing anyone needs is a last-minute scramble to gain sufficient working knowledge in order to be able to assess SVP64 as part of a formal OPF ISA WG RFC.  based on how long SVP64 has taken to develop, such a scramble would be flat-out impractical.

given that SVP64 has taken over 3 years to develop (so far), working knowledge of 3D GPU ISAs such as Broadcom VideoCore IV, MALI Midgard, Vivante, AMDGPU and Intel GMA, as well as Vector Processing ISAs such as Cray, NEC SX Aurora, RVV and Mitch Alsup's MyISA 66000, is absolutely essential.

i therefore cannot emphasise enough how critical it is that OpenPOWER Foundation Members and Power ISA Hardware Engineers be actively involved in SVP64 development.

NLnet funding is available and it is also possible to apply to StandICT.eu for additional Horizon 2020 grant funding.

