[Libre-soc-dev] next ISA WG RFC

Luke Kenneth Casson Leighton lkcl at lkcl.net
Mon Mar 6 18:45:26 GMT 2023

On Monday, March 6, 2023, Jacob Lifshay <programmerjake at gmail.com> wrote:
> On Mon, Mar 6, 2023, 04:10 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
>> folks we need to discuss what RFCs should go in next, and plan
>> groupings
>> https://libre-soc.org/openpower/sv/bitmanip/
>> my recommendation is to not go above about 5-7 instructions
>> per RFC, and to group them.  candidates:
>> * ternlogi, crternlogi, binlut, crbinlut
> these are a good choice to submit next with mostly obvious benefit,
> though we might be able to squeeze an extra bit out of ternlogi's immediate
> by deleting the redundant encodings already covered by li, and, or, xor,
> mv, etc. it seems worth trying and seeing how complex that would be. we can
> also just decide redundancy is ok and simplicity is worth the extra
> encoding bit. maybe that should be an unresolved question that the ISA WG
> can answer.

no.  the Power ISA decoder is ridiculously complex as it is.
POWER9 has a 2 stage decoder which is ridiculous
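
for reference, a minimal sketch of what a ternary-logic operation computes, so that the "redundant encodings" point above is concrete. the function name, the `width` parameter and the bit-order of the 3-bit index are assumptions made for illustration (the index order follows the x86 VPTERNLOG convention); the bitmanip page is the authority for the actual ternlogi definition.

```python
def ternlog(a, b, c, imm8, width=64):
    """Bitwise 3-input lookup: each result bit is picked out of imm8.

    Hypothetical helper for illustration only, not the spec pseudocode.
    """
    result = 0
    for i in range(width):
        # Assumed bit order: a supplies the MSB of the 3-bit index, c the LSB.
        idx = (((a >> i) & 1) << 2) | (((b >> i) & 1) << 1) | ((c >> i) & 1)
        result |= ((imm8 >> idx) & 1) << i
    return result


# Under the assumed index order, some immediates reproduce existing ops:
AND_AB = 0xC0   # a & b, c ignored
OR_AB = 0xFC    # a | b, c ignored
XOR_AB = 0x3C   # a ^ b, c ignored
XOR3 = 0x96     # a ^ b ^ c
MUX = 0xCA      # (a & b) | (~a & c), a bitwise select
```

immediates such as 0xC0, 0xFC and 0x3C are the kind of encoding that duplicates `and`, `or` and `xor`; a single 8-bit immediate field covers all 256 truth tables without any special-casing in the decoder.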

>> * average-sum-diff and abs-accumulate, useful for AV
> pretty good, but imho ternlog is more compelling since av insns already
> exist in vsx

except we're not doing VSX. the case for adding them as scalar
is based on SVP64, these being a stepping stone.

> and, without vectorization of some sort, are not very beneficial

hence why SVP64 was put in as the very first RFC.

>> * grevlut, xperm, bitmatrix
> imho grevlut still has the major problem of using a huge amount of
> encoding space for not much benefit, i think it can be greatly simplified
> while retaining nearly all the practical benefit,

you tried once already and dramatically reduced the capability
(to a fraction of grevlut) which tells me that rather than the
instruction being "not much benefit" you don't quite understand
how powerful it is.

that said because it is so innovative and new there simply
hasn't been any analysis done, no use-cases except that gorc
grev etc can be covered by it (like ternlog covers crand etc),
this will itself make it difficult to justify inclusion,
until that research is done.  good thing there's an NLnet Grant
milestone for exactly that, eh? :)

> therefore imho it's not ready for submission. also, it needs grev and
> bitrev and similar aliases

yep. as assembler-aliases.  they're all there. gorc, gxorc, grev.
just have to find them.  i think this is the one that generates
over a thousand regular-patterned constants. can't remember.
been too long since i wrote it.
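
for context on the aliases mentioned above, a minimal sketch of plain grev (generalised bit reverse) in the butterfly form used by the RISC-V bitmanip draft. this is not grevlut itself, whose constant-generation scheme is not reproduced here; the function name and the `width` parameter are assumptions for illustration.

```python
def grev(x, shamt, width=64):
    """Generalised bit reverse; width is assumed to be a power of two.

    Each set bit k of shamt swaps adjacent blocks of 2**k bits.
    """
    x &= (1 << width) - 1
    step = 1
    while step < width:
        if shamt & step:
            # Mask selecting the lower half of every block of 2*step bits.
            mask = sum(1 << i for i in range(width) if not i & step)
            x = ((x & mask) << step) | ((x >> step) & mask)
        step <<= 1
    return x


def bitrev32(x):
    # Full bit reversal: every stage enabled.
    return grev(x, 31, width=32)


def byteswap32(x):
    # Byte swap: only the 8-bit and 16-bit stages enabled.
    return grev(x, 24, width=32)
```

aliases such as bitrev or a byte-swap then reduce to fixed `shamt` values, which is the sense in which one generalised instruction covers a family of existing ones.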

>> * bmask (x86 BMI on steroids) and cprop (carry-propagation)
>> * bitmask ops (or/and/xor/get) actually shift operations
> aren't those just `crand` or `and` and similar? i'm guessing that's not
> what you meant, so links please.

bmset, etc in the bitmanip page.
bmask is on the vector_ops page. all "vector" ops got mashed
out, leaving "support" routines like cprop and generalisation
of set-before-first etc.
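
as a point of comparison for the "x86 BMI" remark, a sketch of three x86 BMI1 lowest-set-bit operations. these are the x86 operations only, shown as a baseline; they are not the bmask or cprop definitions from the vector_ops page, and the `width` parameter is an assumption for illustration.

```python
def blsi(x, width=64):
    """Isolate the lowest set bit (x86 BMI1 BLSI): x & -x."""
    mask = (1 << width) - 1
    return x & -x & mask


def blsmsk(x, width=64):
    """Mask up to and including the lowest set bit (BLSMSK): x ^ (x - 1)."""
    mask = (1 << width) - 1
    return (x ^ (x - 1)) & mask


def blsr(x, width=64):
    """Clear the lowest set bit (BLSR): x & (x - 1)."""
    mask = (1 << width) - 1
    return x & (x - 1) & mask
```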

>> * crweird operations (powerful interchange between GPR and CR)
>> * carryless mul/div/mod
> these are basically good to go, though imho are less critical so can be
> left for later when we need a break from more complex stuff.

ack, good assessment.
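
for readers unfamiliar with carryless arithmetic, a minimal sketch over GF(2) polynomials held in Python integers. the function names are hypothetical, operand widths are left unbounded, and raising on a zero divisor is a choice made for this sketch, not a statement about what the proposed instructions do.

```python
def clmul(a, b):
    """Carryless multiply: shift-and-XOR instead of shift-and-add."""
    result = 0
    i = 0
    while b >> i:
        if (b >> i) & 1:
            result ^= a << i
        i += 1
    return result


def cldivmod(n, d):
    """Carryless long division, returning (quotient, remainder)."""
    if d == 0:
        # Assumption for this sketch only.
        raise ZeroDivisionError("carryless division by zero")
    q = 0
    d_len = d.bit_length()
    while n.bit_length() >= d_len:
        shift = n.bit_length() - d_len
        q |= 1 << shift
        n ^= d << shift  # subtract (XOR) the aligned divisor
    return q, n
```

the identity `clmul(q, d) ^ r == n` holds for the pair returned by `cldivmod(n, d)`, mirroring ordinary integer division with XOR in place of addition.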

>> * int/fp mv and mv.swizzle/fmv.swizzle
> imho int/fp mv/convert should be its own separate rfc without swizzle.

ack. yes they are different.

>  imho int/fp is basically ready,

ah i forgot, it's not ready.

> trying to smash it into fewer opcodes is imho a fool's errand

please don't denigrate rational arguments in this way.

> because it just doesn't fit, uses the same amount of encoding space, and
> makes it harder to understand.

irrelevant. i have gone over this already and am not repeating
it. reducing the number of "actual" instructions is critical
and the absolute top priority.  ternlogi is not submitted
as 256 instructions. please review tom forsyth's video on

please can you adjust the spec page to account for that
otherwise i am forced to do it

> also imho we might want to do swizzles after submitting at least basic
> svp64 subvl support. also imho swizzle might not be ready for submission,
> icr reviewing it in detail, so it may need some design tweaks.
>> transcendentals and the GF groups are a bit big to tackle at
>> the moment.
>> the most obvious priority ones (easiest to justify) would
>> be the AV ones. there exist already VSX variants.
>> thoughts?
> in summary imho we should next submit int/fp mv/convert (no swizzle for
> now) or ternlog & friends.
> Jacob

crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
