[Libre-soc-dev] OPF ISA WG External RFC

Luke Kenneth Casson Leighton lkcl at lkcl.net
Thu Sep 1 10:01:14 BST 2022

[Caveat for this discussion: we are under time pressure to
complete NLnet tasks before mid-October! those take priority]

On Thursday, September 1, 2022, Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:
> https://openpower.foundation/isarfc/
> this is where we submit external RFCs, hooray

so, the next task (for which the large NLnet Grant
has been submitted) is to develop an *overview* Formal
Proposal (the middle radiobutton Toshaan mentioned)

https://libre-soc.org/nlnet_2022_opf_isa_wg/ being reviewed by NLnet

note that there are therefore NO FUNDS - yet - for the development
below, and not much time available either. this is a placeholder
discussion which needs to be tackled later: i am note-taking
from various discussions of the past few days.

basically the first step is to give the IBM Architect team
who are responsible for Opcode and SPR Allocation a *full*
advance notice of literally every instruction we have designed
so that:

1. they know what to expect
2. *we* don't screw up.

both those are equally important!

the biggest table by far is this one:

but it is *not the whole story* because the GF ops have
not been allocated, nor rounding converts, nor 3D textures,
and additionally there are the Transcendentals

fortunately the Ts can fit into EXT063 and EXT059 naturally.

this leaves the *TWO* major opcodes needed for bitmanip:
there are 4 instructions needing a 2-bit XO (one of those
needs an Rc=1) and then there are *seven* needing a 3-bit
XO.  then there are several needing 5-6 bit XOs and a batch
that are happy with 10-bit (X-Form).


* 2-bit XO: ternlogi, crternlogi, grevluti, SVP64
* 3-bit XO: bmrevi, xpermi, binlut, swizzle mv/fmv etc.
* 5-6-bit XO: setvl, svstep etc., bmask, fmvis/fishmv,
             bincrflut, etc.
* 10-bit XO: cprop, cldiv, clmul etc., avgadd etc., xperm,
            bmat*, shadd*
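
to make the space accounting for the XO widths listed above concrete, here is a rough sketch (not an official allocation tool). it assumes the standard 32-bit Power ISA instruction with a 6-bit primary (major) opcode, and it ignores Rc, so the one 2-bit XO op needing Rc=1 is not modelled. the names `xo_cost` and `demand` are made up for illustration; the counts in `demand` are the "4" and "seven" quoted earlier.

```python
from fractions import Fraction

INSN_BITS = 32         # fixed-width Power ISA instruction (assumption stated above)
MAJOR_OPCODE_BITS = 6  # primary opcode field, EXT000..EXT063


def xo_cost(xo_bits):
    """Return (share of one major opcode used by a single instruction,
    bits left over for operands/immediates) for a given XO width."""
    share = Fraction(1, 2 ** xo_bits)
    operand_bits = INSN_BITS - MAJOR_OPCODE_BITS - xo_bits
    return share, operand_bits


# instruction counts taken from the text: four 2-bit XO ops, seven 3-bit XO ops
demand = {2: 4, 3: 7}

total = Fraction(0)
for xo_bits, count in demand.items():
    share, operand_bits = xo_cost(xo_bits)
    total += count * share
    print(f"{xo_bits}-bit XO: {count} ops x {share} of a major opcode, "
          f"{operand_bits} operand bits each")

print(f"total for the short-XO ops alone: {total} major opcodes")
```

under those assumptions the four 2-bit XO ops fill one major opcode and the seven 3-bit XO ops fill 7/8 of another, which is in line with the "TWO major opcodes" figure; the 5-6 bit and 10-bit XO ops cost far less each and are not counted here.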

haven't even found space for GF* yet (except cl*)
nor for the int_fp_mv rounding ops nor 3D texture
(rounding needs redesign to make them look like
far less ops by having the rounding mode as an

turns out that EXT006 might be a good place for the 2-bit
XO ops and that if you look carefully at EXT019 there is
1/2 of the EXT019 Quadrants unused (2/4 Quadrant pages
entirely empty) but that is seriously pushing our luck, i feel.

if EXT001 is not available then options are:

* either SVP64 or 2-bit XOs go into EXT006 whilst the
 other goes into EXT009
* 3-bit XOs either get distributed across EXT019 and
 a few other places or they get allocated EXT005
* 5-6 bit XOs in EXT031 and EXT019 taking up entire columns
 (exactly like addpcis does) or in EXT005
* 10-bit XOs, usual job, across EXT031 or EXT019 or EXT005

We also need to ensure that there is room for the
future SVP64-Single proposal (prioritises register access
numbering as scalar ONLY instructions), which will involve
a LARGE allocation for extending regnums with EXTRA3 or
even EXTRA4. particularly of concern there is CR 3-bit
(BFA), because an extra FOUR bits is needed to get up
to cr0-cr127, and also predicate masks and elwidth overrides.

SVP64-Single is therefore highly likely to need 24 bits
on its own, thus it is its own 2-bit XO and would fit
in EXT006 next to SVP64.
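
the two numbers above (24 bits, and four extra bits for cr0-cr127) can be sanity-checked with a few lines of arithmetic. this is only a sketch, assuming a 32-bit instruction word with a 6-bit major opcode; `payload_bits` and `cr_fields` are hypothetical helper names, not anything from the spec.

```python
INSN_BITS = 32         # fixed-width Power ISA instruction (assumption)
MAJOR_OPCODE_BITS = 6  # primary opcode field


def payload_bits(xo_bits):
    """Bits left in a 32-bit word after the major opcode and an XO of xo_bits."""
    return INSN_BITS - MAJOR_OPCODE_BITS - xo_bits


def cr_fields(base_bits=3, extra_bits=0):
    """Number of CR fields addressable with a BFA-style field plus extension bits."""
    return 2 ** (base_bits + extra_bits)


# a 2-bit XO leaves exactly the 24 bits that SVP64-Single is expected to need
print("payload with 2-bit XO:", payload_bits(2))

# plain 3-bit BFA reaches cr0-cr7; four more bits reach cr0-cr127
print("CR fields, 3-bit BFA:", cr_fields(3, 0))
print("CR fields, 3+4 bits:", cr_fields(3, 4))
```

so a quarter of a major opcode (one 2-bit XO slot) is the smallest allocation that still leaves 24 bits free, which is why SVP64-Single ends up wanting a slot of its own.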

also needed is future SVP64 expansion, and that ends up in
total as 75% of EXT006 (three of the four 2-bit XO slots):

  SVP64 | SINGLE | FUTURE | lxvq/stxvq

basically it does all fit, but leaves almost no room anywhere
else in the 32-bit space. the idea of 64-bit ops has been
rejected because these all need to be SVP64 prefixed which
is 96-bit (!) so a hard NO on that one.

therefore we are in effect proposing taking up 75% *EACH*
of *THREE* major opcodes and the justification for this
has to be absolutely rock frickin solid.

Surprisingly this is pretty easy and straightforward, in
each case, because each op is justifiable on its own merits.
the issue that is "alarming" is simply that there hasn't
been anyone else putting in *anything* for Power ISA outside
of IBM for 12+ years, and we are stepping in to rapidly catch
Power back up to general-purpose compute with ARM, AMD and Intel.


crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
