[Libre-soc-dev] compressed instructions state requirements

Tue Nov 24 21:07:38 GMT 2020

On Tue, Nov 24, 2020, 09:04 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:

> On Tue, Nov 24, 2020 at 5:41 AM Jacob Lifshay <programmerjake at gmail.com>
> wrote:
>
> > If the word-swapping scheme is as I described earlier and the only
> > instructions used are 32/64-bit 32-bit-aligned instructions, then they
> > are indistinguishable since they are the exact same bytes in the exact
> > same memory locations (assuming the standard PowerISA LE/BE etc. modes
> > are known).
>
> this - 32/64 - i can grok.
>
> > If the stream contains any 16 or 48-bit instructions, you also need
> > the twos place in the PC.
>
> are you *certain* that PC[1] is the only information needed?  i'd
> really like to see a brute-force-generated table which shows that.
> which shouldn't be difficult to generate: 4x for-loops to generate all
> possible permutations of 16/32/48/64 bit instructions.
>
> > If the stream contains any 16-bit instructions,
> > then you also need to know if the decoder should start in Standard or
> > Compressed Mode, and (assuming we include the
> > switch-to-standard-mode-for-1-instruction
> > bit) if the last instruction specified the
> > switch-to-standard-mode-for-1-instruction.
>
> ahh this is what i was expecting.  and it basically means that we are
> going to need to, just like VLE, mark 64k pages as
> "alternatively-encoded".  given that 32-bit opcodes can be embedded in
> that, this is not a big deal.
>

My idea of how 16-bit instructions work is that they should be usable
anywhere (like RVC), no special pages needed. The extra info is
conceptually part of the PC (or some decode status register). Unlike VLE,
it works fine with any combination of 32/64-bit mode and LE/BE mode, in a
way such that the bytes in memory needed to encode pre-existing
instructions are completely unchanged -- you can run pre-existing
PowerPC[64][LE] programs entirely unmodified even if the processor supports
16/48-bit instructions since the program always starts executing in
Standard Mode.
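
To make "conceptually part of the PC" concrete, here is a minimal Python
sketch of the extra state as a small record carried alongside the PC. The
names (DecodeState, compressed, std_for_one, entry_state) are hypothetical
and only illustrate the idea; nothing here is a settled register layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodeState:
    """Hypothetical record of the state the decoder needs besides the PC."""
    pc: int                    # byte address; bit 1 is the "twos place"
    compressed: bool = False   # False = Standard Mode, True = Compressed Mode
    std_for_one: bool = False  # previous insn requested one Standard Mode insn

    def pc_twos_place(self) -> int:
        # Only non-zero once 16-bit or 48-bit instructions are in the stream.
        return (self.pc >> 1) & 1


def entry_state(entry_pc: int) -> DecodeState:
    """Assumption from the text: execution always begins in Standard Mode."""
    return DecodeState(pc=entry_pc)
```

A pre-existing program never changes any of the extra fields, so it behaves
exactly as it would on a processor without 16/48-bit instructions.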

>
> my former misunderstanding of VLE was that it was exclusively
> 16-bit-only opcodes: it's not.  it's mixed standard (v2.07B) and
> 16-bit instructions.  if it was only 16-bit instructions, there would
> be a serious problem because you'd need to fill pretty much the
> entirety of the 64k page with branches into standard 32-bit
> instructions, and back.
>
> if however a mix is permitted within a "marked" 64k page (16/32/48/64)
> then the
>

Truncated sentence?

>
>
> > I renamed 32-bit mode to Standard Mode since it is the
> > backward-compatible mode and also supports 64-bit (ISA v3.1)
>
> caveat (1) it's in the future (because of the EULA only granting
> permission to do v3.0B)
>

We need to contact the OpenPower Foundation and get permission to implement
v3.1.

> caveat (2) if designing ISA extensions we still have to take into account
> v3.1B:
>
> those caveats taken into consideration: i'm very strongly in favour of
> not supporting v3.1 prefixing, except under severe arm-twisting
> circumstances, and *definitely* as a hard-rule mutually-exclusively
> incompatible with *any* SV Prefixing.
>

I disagree: having code that's compatible with v3.1 means getting a speed
boost from better support for larger immediates (34 bits instead of 16) as
well as PC-relative addressing. This could mostly eliminate the need for a
TOC, since shared libraries can generally be assumed to be less than 8GB in
size. This should also reduce code size somewhat. All of that only holds
once compilers catch up, though.
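
As a quick check of where the 8GB figure comes from, the sketch below
computes the signed range of a 16-bit and a 34-bit immediate. The helper
name signed_range is made up for this illustration.

```python
def signed_range(bits: int) -> tuple[int, int]:
    """Inclusive (min, max) of a two's-complement immediate of given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


GIB = 1 << 30

lo16, hi16 = signed_range(16)   # classic D-form immediate
lo34, hi34 = signed_range(34)   # v3.1 prefixed immediate

# A 34-bit signed displacement reaches 8 GiB in either direction.
reach_gib = (hi34 + 1) // GIB
```

So a PC-relative reference can reach anything within 8 GiB either side of
the instruction, versus 32 KiB for a 16-bit displacement.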

The additional benefit of being compatible with v3.1 is we can run more
programs without needing recompilation.

>
> however it *MIGHT* and i stress *MIGHT* be ok to interleave v3.1B
> instructions with 16-bit Compressed.  but definitely *NOT* with SV
> Prefixing.
>

The way I'm envisioning it, SVP64 instructions share the PowerISA v3.1
prefix encoding space with PowerISA v3.1 64-bit instructions (more than
half that space is available), SVP48 instructions use the same 48-bit
encoding space as all other 48-bit instructions (probably using primary
opcode 0) and SVP32 instructions use other 32-bit encoding space (possibly
shared using primary opcode 0). When instructions are prefixed, their
encoding is switched to use the encoding for the proper size of SVP32/48/64.

>
> the reason is very simple: the decoding of the length of an
> instruction has to be done through strict top-level analysis without
> "deep packet inspection".
>

Yup, that can be done by (in Standard Mode) decoding the primary opcode as
well as (for opcode 0) one bit of the extended opcode field (the 256 place)
for compatibility with the "Service Processor Attention" instruction, which
needs to be 32-bit. That should be sufficiently trivial to satisfy your
worries about decode issues with multi-issue.

This causes no issues with needing all 0s to be illegal since, in Standard
Mode, the first 32 bits in memory being all 0s would be an illegal 48-bit
instruction, and in Compressed Mode all 0s would be an illegal 16-bit
instruction. There is no need to use Primary Opcode 0 for 16-bit
instructions in order to achieve that, so I think 16-bit instructions in
Standard Mode should use Primary Opcode 5, since that is entirely
unallocated. That avoids annoying workarounds to keep "Service Processor
Attention" 32-bit: the Extended Opcode field which encodes "Service
Processor Attention" is outside of the first 16 bits, and all the other
bits are don't-cares for "Service Processor Attention", so using PO 0
would require always reading 32 bits to check -- very messy.
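
A minimal Python sketch of that top-level length decode, for Standard Mode
only. It assumes the first 32 bits of the instruction are presented as one
integer with the primary opcode in the top 6 bits, that PO 1 is the v3.1
prefix, and that PO 0 and PO 5 are used as proposed in this thread (48-bit
and 16-bit respectively). None of this is a finalized encoding, and the
function name is hypothetical.

```python
def standard_mode_length(word: int) -> int:
    """Instruction length in bits, decided from the first 32-bit word."""
    po = (word >> 26) & 0x3F        # primary opcode: top 6 bits
    if po == 1:
        return 64                   # PowerISA v3.1 prefixed instruction
    if po == 5:
        return 16                   # assumption: PO 5 proposed for 16-bit
    if po == 0:
        xo = (word >> 1) & 0x3FF    # 10-bit extended opcode field
        # The 256 place set keeps "Service Processor Attention" 32-bit;
        # otherwise PO 0 is assumed to be the 48-bit encoding space.
        return 32 if xo & 0x100 else 48
    return 32                       # every other primary opcode
```

Only the PO 0 case looks below the top 16 bits, which is the reason given
above for not putting 16-bit instructions there. Under these assumptions an
all-zeros word decodes as a 48-bit instruction, and 0x00000200 (attn, XO=256)
stays 32-bit.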

>
> > and 48-bit (no spec yet) instructions.
>
> TBD when we get to SV Prefixing.  remember also that we have SV-P64
> (32-bit SV prefix plus a 32-bit instruction) and we have SV-C64
> (32-bit prefix plus a 16-bit swizzle prefix plus a 16-bit Compressed)
>

I propose that we limit the maximum possible instruction length to 64 bits
(kinda like x86's 15-byte limit), allowing the encoding I described above to
be sufficient. In particular, this means 64-bit instructions can't be
further prefixed.

>
> > Compressed Mode is 16-bit only.
> >
> > >
> > > this is a really important non-rhetorical  question that i would
> > > appreciate if you could answer plainly yes or no.
> > >
> >
> > so, extra state beyond PowerISA v3.1 is needed:
> > the twos place of the PC. (needed for 16 and/or 48-bit instructions)
> > the decode mode (Standard/Compressed) -- 1 bit. (only needed for 16-bit
> > instructions)
> > the last instruction's switch-to-standard-mode-for-1-instruction bit
> > (only needed for 16-bit instructions with optional
> > switch-to-standard-mode-for-1-instruction).
> >
> > If we don't mind compressed instructions always having only 10-bits of
> > space available, the only extra state required is the twos place of the
> > PC. No compressed mode necessary.
>
> this will definitely need to be demonstrated, with a full proof
> (covering all possible permutations).
>

Yup, that's what the demo is supposed to be.
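
For reference, a rough Python sketch of the alignment half of such a
brute-force check. It enumerates every sequence of instruction lengths and
records which values the low two bits of the PC can take at an instruction
boundary. It deliberately does not model the word-swapping scheme or the
Standard/Compressed mode bits, so it is not the full proof asked for; the
function name and the default depth of 4 (matching the "4x for-loops"
suggestion) are assumptions.

```python
from itertools import product


def reachable_pc_low_bits(lengths, depth=4, start_pc=0):
    """Set of (pc & 3) values seen at any instruction boundary.

    lengths: allowed instruction sizes in bytes, e.g. (2, 4, 6, 8)
    depth:   number of consecutive instructions to enumerate
    """
    seen = set()
    for seq in product(lengths, repeat=depth):
        pc = start_pc
        seen.add(pc & 0b11)
        for nbytes in seq:
            pc += nbytes
            seen.add(pc & 0b11)
    return seen
```

With only 4- and 8-byte instructions the result is {0}; allowing 2- and
6-byte instructions as well gives {0, 2}, i.e. PC[1] is the only extra
address bit that can become non-zero.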

> > > ah that looks really good although it looks like it should be in
> > > https://libre-soc.org/openpower/sv/major_opcode_allocation/  (for now)
> > >
> >
> > Maybe, it might be better on its own page --
>
> yes good idea, go for it.
>

will do

Jacob