[Libre-soc-isa] [Bug 529] scheme for supporting 16/48-bit instructions on PowerPC LE with full backward compatibility

Thu Nov 12 13:55:09 GMT 2020

https://bugs.libre-soc.org/show_bug.cgi?id=529

--- Comment #6 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #5)
> (In reply to Luke Kenneth Casson Leighton from comment #4)
> > (In reply to Luke Kenneth Casson Leighton from comment #3)
> > > https://libre-soc.org/openpower/sv/major_opcode_allocation/
> 
> The explanation on the wiki page seems quite a bit less general (limiting
> alignment) than what I was envisioning:

yes i misunderstood (but accidentally came up with an alternative, which
is slightly more complex i.e. involves a queue and needs to be able to
"take" from the 1st 2 entries rather than always take from the front)

> I was thinking of conceptually the instruction stream would just be a stream
> of aligned 16-bit chunks which are decoded into *totally-unaligned*
> 16/32/48/64-bit instructions by combining 1/2/3/4 chunks in the conceptual
> sequence. 

as 2/1/4/3/6/5 order (in 16-bit chunks).
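
a minimal sketch of that ordering, assuming it just means the two 16-bit halves of every aligned 32-bit word get exchanged before decode (that is one reading of the LE layout, not something the thread pins down). `hword_swap` is a made-up name for illustration:

```python
def hword_swap(chunks):
    """Exchange the two 16-bit chunks inside each 32-bit word.

    Positions 1,2,3,4,5,6 come out as 2,1,4,3,6,5.
    """
    if len(chunks) % 2:
        raise ValueError("expected a whole number of 32-bit words")
    out = []
    for i in range(0, len(chunks), 2):
        # second half of the word first, then the first half
        out.extend([chunks[i + 1], chunks[i]])
    return out


order = hword_swap([1, 2, 3, 4, 5, 6])
```

applying the swap twice gives back the original sequence, so the same step would work in both directions.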

> All different instruction sizes can be arbitrarily interleaved.
> 
> The only twists are:
> - that the 16-bit chunks are laid out oddly in LE mode for backward
> compatibility.
> - that jumps/branches/returns/calls can only branch to 32-bit aligned
> addresses, so the branch targets need to be aligned by either using a larger
> equivalent instruction (preferred) or inserting NOPs.

yeah we're not going to modify PowerISA to add the extra bit to target
jumps at the 16-bit level.  bit of a pain.

> interrupt/exception
> returns *can* branch to 16-bit aligned addresses, however, since that's
> needed for preemptive context switching.

as long as the full CIA/NIA is stored (and restored), yes.

> it would require some additional thought, but I think it probably would.

ultimately though it comes down to which takes more gates.  implicit
hword-swapping (hidden from the actual instruction decoder) seems a lot
simpler.

> We would need to decide what to do for PC-relative instructions, do we
> include the second-from-lsb in the visible PC or not?

urr that's a wrinkle.  ok p37 v3.0B "branch" pseudocode:

    if AA then NIA  <-iea EXTS(LI || 0b00)
    else       NIA  <-iea CIA + EXTS(LI || 0b00)
    if LK then LR <-iea  CIA + 4

the assumption is always that the CIA is word-aligned.  that means that
any computations, if they start from a non-word-aligned point, will stay
at a non-word-aligned point.
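
to make that concrete: the displacement is LI with 0b00 appended, so it is always a multiple of 4, and adding it to CIA can never change CIA's bottom two bits. a small sketch (the function name is hypothetical, and LI is passed as an already-signed integer to keep it short):

```python
def low2_after_relative_branch(cia, li):
    """Return the 2 LSBs of NIA for a relative branch from `cia`."""
    disp = li << 2            # LI || 0b00: always a multiple of 4
    return (cia + disp) & 0b11


# a branch starting at a hword-aligned (but not word-aligned) address
# lands on an address with the same 2 LSBs, whatever LI is
misaligned = low2_after_relative_branch(0x1002, 5)
aligned = low2_after_relative_branch(0x1000, 5)
```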

nuts.

ok so here's two options:

* all 32-bit branches (and SV-P48/64 ones) start at word-aligned boundaries
  OR
  that they're *assumed* to start at the word-aligned boundary

and then:

* that we design some 16-bit instructions which can be hword-aligned

however, the calculation of LR is definitely no longer "CIA+4", it's going
to be "CIA+len(instruction)" which is variable.

so for example, b/ba/bl/bla would become:

    if AA then NIA  <-iea EXTS(LI || 0b00)
    else       NIA  <-iea CIA + EXTS(LI || 0b00)
    NIA[62:63] <- 0b00 # set 2 LSBs to zero (MSB0 numbering)
    if LK then LR <-iea  CIA + len(current_instruction)
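
a rough executable version of the modified pseudocode, just to check the arithmetic. everything here is a sketch: `branch`, `sign_extend` and the `insn_len` parameter (instruction length in bytes, e.g. 4 for a 32-bit branch, 6 assumed for a 48-bit one) are made-up names, and a 64-bit address space is assumed:

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch(cia, li, aa, lk, insn_len, lr):
    """Return (NIA, LR) for the modified b/ba/bl/bla sketch.

    li is the raw 24-bit LI field, lr is the current LR value.
    """
    disp = sign_extend((li & 0xFFFFFF) << 2, 26)   # EXTS(LI || 0b00)
    nia = disp if aa else cia + disp
    nia &= MASK64 & ~0b11                          # force the 2 LSBs to zero
    if lk:
        lr = (cia + insn_len) & MASK64             # no longer a fixed CIA+4
    return nia, lr


# a 6-byte branch-and-link sitting at a hword-aligned address
nia, lr = branch(0x1002, li=4, aa=False, lk=True, insn_len=6, lr=0)
```

here the target comes out word-aligned (0x1010) even though CIA was not, and LR points at the instruction after the 6-byte branch (0x1008).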

this shouuuld be ok... and 16-bit branch instructions, although there
will be far less space for an immediate, would be hword-aligned, taking
care of being able to jump at 16-bit granularity.

branching via TAR (which ignores the 2 LSBs) also needs to be evaluated
(ignore only 1 LSB?).  however we'd first need to find out whether the 2 LSBs
are actually used by any compilers (p32 2.3.4)
