[Libre-soc-dev] efficient decoding algorithm for variable-length instructions

Thu Nov 26 05:07:56 GMT 2020

On Wed, Nov 25, 2020 at 8:27 PM Luke Kenneth Casson Leighton
<lkcl at lkcl.net> wrote:
>
> jacob the intermingling of 1st and 2nd level decoding (xo test) is
> something that specifically cannot be done.  the toplevel FSM on the
> raw insn stream has to be kept as absolute-basic as possible.

Look closely: the code checks only one more bit, not piles of them.
Basically, it's another bit that is tested like the primary opcode,
just located somewhere else in the instruction.
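
A minimal sketch of that point in Python. The position of the extra
bit (EXTRA_BIT) and the function names are made up for illustration
and are not taken from the proposal; only the attn encoding (primary
opcode 0, XO 256) and the 6-bit primary opcode field come from the
Power ISA.

```python
# Sketch only: EXTRA_BIT is a hypothetical position, not the proposal's.
PRIMARY_OPCODE_SHIFT = 26   # primary opcode = top 6 bits of a 32-bit word
ATTN_WORD = 0x00000200      # attn: primary opcode 0, XO 256
EXTRA_BIT = 1 << 25         # hypothetical discriminator bit (clear in attn)


def primary_opcode(word):
    """Return the 6-bit primary opcode of a 32-bit instruction word."""
    return (word >> PRIMARY_OPCODE_SHIFT) & 0x3F


def is_new_encoding(word):
    """True if the word is in the EXT000 space and the single extra
    bit is set, so it cannot be confused with attn."""
    return primary_opcode(word) == 0 and (word & EXTRA_BIT) != 0
```

The top-level test stays a primary-opcode compare plus one AND on a
single bit; attn itself falls through to the normal 32-bit decoder.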

> in addition just as with how SVP64 must not be intermingled with v3.1B
> 64 bit, SVP48 11 bit prefixes must not be intermingled with attn.
>
> to try to go "err all 11 bit prefix combinations are permitted oh
> except err if it happens to be the exact same encoding as if an attn
> instruction was overlapping us" is asking for trouble.

I don't think it's that much more complex to decode (at most 2-3 gates
per 16 bits), and if we do combine them, we get the additional
advantage of compatibility with PowerISA v3.x without needing to
switch to a special "GPU Mode". Also, everyone else can then use the
16/32/48/64-bit scalar instructions that we defined without needing to
implement GPU Mode, which is too big for them to practically implement
in a non-GPU.

> and, just as with SVP64 if IBM decides to add extra instructions
> beyond attn to the EXT000 space we're screwed.

That's what the OpenPower ISA Workgroup is for: everyone gets to know
which instruction encodings everyone else is proposing, so that
double-allocation is avoided. This is part of

> best nip that in the bud and say "nope, attn is eliminated and moved elsewhere".

We *can* do that in GPU Mode, but unless it is also supported in CPU
mode, all the scalar code and all non-Libre-SOC CPUs can't benefit
from the new compressed instructions. The result: power and
performance are wasted because we decided that people only need
compressed instructions if their CPU/task is GPU enough, even though
it's readily evident from RVC that CPU tasks can and do benefit from
them.

Jacob