[Libre-soc-isa] [Bug 560] big-endian little-endian SV regfile layout idea

Mon Jan 4 15:23:11 GMT 2021

https://bugs.libre-soc.org/show_bug.cgi?id=560

--- Comment #53 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Alexandre Oliva from comment #50)
> ok, I can tell you've got the concept of significance all confused with
> position.
> 
> Significance has to do with the quantity the symbol stands for, not with its
> position.

probably.  or... *deep breath*... i did not mention that the v3.0B and v3.1B
specs use MSB0 bit numbering.
https://en.wikipedia.org/wiki/Bit_numbering#MSB_0_bit_numbering

whereas both microwatt and libresoc internally use LSB0 numbering
https://en.wikipedia.org/wiki/Bit_numbering#LSB_0_bit_numbering

this causes code to have to do things like 31-i, 7-i, 3-i, 63-i:

https://github.com/antonblanchard/microwatt/blob/39c826aa46a9dd80a12b572373c55d6156c4df07/execute1.vhdl#L841

in the case of CRs it took four months to find and fix a long-standing bug
caused by exactly this.
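
to illustrate the 31-i / 63-i pattern, here is a minimal python sketch of the
MSB0 to LSB0 index conversion.  the function names are made up for this
example and are not taken from microwatt or libresoc:

```python
def msb0_to_lsb0(i, width=64):
    """Convert an MSB0 bit index (0 = most significant bit) into the
    equivalent LSB0 index (0 = least significant bit) for a field that is
    `width` bits wide.  This is the "width-1-i" seen in the HDL."""
    if not 0 <= i < width:
        raise ValueError("bit index out of range")
    return width - 1 - i


def get_bit_msb0(value, i, width=64):
    """Extract the bit that the spec calls bit `i` (MSB0 numbering) from a
    plain integer, which is naturally LSB0."""
    return (value >> msb0_to_lsb0(i, width)) & 1


# spec "bit 0" of a 32-bit field is the top bit of the integer
top_bit = get_bit_msb0(0x8000_0000, 0, width=32)
```

forget the conversion in one place (or apply it twice) and the wrong bit is
silently selected, which is the kind of bug that is hard to track down.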

> do you see now why it doesn't make sense that the conversion from BE to LE
> (or vice versa) places the MSByte in the LSByte?

deep breath: it doesn't matter if it "makes sense", it's what the actual code
(the simulator, the HDL of microwatt and the HDL of Libre-SOC) actually does.

and they all pass all unit tests in both LE/BE mode.

therefore, the thinking - against "common sense" - has to be adjusted to
IBM's way (PowerISA's way) of thinking.

which is known to be spectacularly weird, including being the only modern ISA
spec to continue to insist on using MSB0 numbering.

sigh.

now, it *could* be as simple as, "despite it not making sense you used the
wrong opcode in the example".  it could be as simple as: you used ld in the
example where you should have used ldbrx, because of the IBM/POWER-weirdness.

> memory is called endianness.  if lower-address implies lower-significance,
> and vice-versa, that's  LE; if lower-address implies higher-significance,
> and vice-versa, that's BE.

leaving aside IBM's use of MSB0: unfortunately we have it as a matter of
straight fact, from the evidence that i've presented, and from the unit tests
passing 100% in both microwatt and libresoc, that an inversion is occurring at
the byte level where you are clearly not expecting one, for the opcode named
"ld".

now, given that no problems occur in gcc, we can put forward a hypothesis that
this oddity has been "solved" (including in gcc) by simply using the
opposite LD/ST opcode: ldbrx used rather than ld, and vice-versa.

whether that's the case, i don't know.  however i am telling you, as fact (and
the source code is *not* going to change, because the unit tests pass in both
microwatt and libresoc), that ld *does* do byte-reversal in BE mode and
ldbrx does *not* do byte-reversal in BE mode.

correspondingly (because of the XNOR): ld *does not* do byte-reversal in LE
mode and ldbrx *does* do byte-reversal in LE mode.

again:

* ld     LE: straight
* ldbrx  LE: byte-reversed
* ld     BE: byte-reversed
* ldbrx  BE: straight

these are the facts, from both codebases, both passing unit tests 100%.
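
the table above can be reproduced with a small python model.  this is only a
sketch: the names are hypothetical, "straight" is assumed here to mean "memory
bytes interpreted lowest-address-first as least significant", and the XNOR is
assumed to be between the "is a byte-reverse opcode" flag and the LE flag.
the actual signals used in the HDL may be arranged differently:

```python
def needs_byte_reversal(is_brx, big_endian):
    """True if the loaded bytes get swapped, per the four-entry table.

    Assumption for this sketch: the decision is the XNOR of the
    byte-reverse-opcode flag and the little-endian mode flag."""
    little_endian = not big_endian
    return is_brx == little_endian  # XNOR of two booleans


def load_doubleword(mem, addr, is_brx, big_endian):
    """Model of ld (is_brx=False) and ldbrx (is_brx=True) on 8 bytes."""
    raw = bytes(mem[addr:addr + 8])
    if needs_byte_reversal(is_brx, big_endian):
        return int.from_bytes(raw, "big")     # byte-reversed
    return int.from_bytes(raw, "little")      # straight


mem = bytes(range(1, 9))  # 01 02 03 04 05 06 07 08 at addresses 0..7
for name, brx in (("ld", False), ("ldbrx", True)):
    for mode, be in (("LE", False), ("BE", True)):
        print(f"{name:6}{mode}: {load_doubleword(mem, 0, brx, be):#018x}")
```

note that ld in LE mode and ldbrx in BE mode give the same register value, as
do ld in BE mode and ldbrx in LE mode.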

it is LD and LDBRX *in LE mode* that behave "as expected" (byte order is left
alone, according to expectations of what the opcodes "should" do) and it is LD
and LDBRX *in BE mode* that have the byte-ordering "reversed" against
"expectations" of behaviour for these opcodes.

once the significance of this has sunk in i believe it may start to make sense.
ultimately i think it is that XNOR that is confusing you.

at the moment i cannot yet tell if you are still at the "disbelief of how
things work in Power ISA" stage.

this is very common :)

(In reply to Cesar Strauss from comment #52)
> It seems to me that, as long as:
> 
> 1) we rigorously stick to vector (SVP64, SUBVL) load, stores and operations
> on vector registers,
> 2) stick to predication to access its sub-elements,
> 3) do not use non-SVP64 instructions on register previously used as vectors
> and vice-versa,
> 4) do not change SUBVL on the same vector register

deep breath: these are all things that were envisaged, from the very beginning
(2 years ago), to be allowed.

otherwise we might as well have a completely separate Vector regfile, and we
lose the advantage of not having (not needing) inter-regfile conversion / mv
opcodes.

actually even if we did have a Vector regfile the problem still exists because
of the way that the union typedef works.

> Then, the "endianness of the register file", and "VL indexing direction"
> should become totally transparent (architecturally invisible). We can choose
> one mode (say LE) and stick to it.

this was initially the decision made by RISC-V RVV: that if the "parameters"
change, the contents of the Vector regfile are actually wiped out.  however
within a year multiple people explained to them that the extra overhead
involved in transferring between scalar and vector regfiles, as well as the
extra cost of the vector regfile SRAM, was too great for some implementors.

consequently they provided a "fit on top of FP" mode and had to define - just
as we are needing to do - a precise and exact ordering of the entire SRAM of
the regfile(s).
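
as a rough illustration of what "a precise and exact ordering of the entire
SRAM" means, here is a python sketch that treats the regfile as one flat
byte-addressed memory.  everything in it is an assumption made for the
example (the class name, 128 registers of 8 bytes, and LE ordering of both
elements and bytes).  it is not the layout decided for SV:

```python
class FlatRegfile:
    """Hypothetical model: the whole regfile is one flat bytearray, and
    vector elements are just byte ranges within it."""

    def __init__(self, nregs=128, reg_bytes=8):
        self.reg_bytes = reg_bytes
        self.sram = bytearray(nregs * reg_bytes)

    def _offset(self, base_reg, idx, elwidth_bytes):
        # assumed LE ordering: element 0 sits in the lowest-numbered bytes
        return base_reg * self.reg_bytes + idx * elwidth_bytes

    def write_elem(self, base_reg, idx, elwidth_bytes, value):
        off = self._offset(base_reg, idx, elwidth_bytes)
        value &= (1 << (8 * elwidth_bytes)) - 1  # truncate to element width
        self.sram[off:off + elwidth_bytes] = value.to_bytes(elwidth_bytes, "little")

    def read_elem(self, base_reg, idx, elwidth_bytes):
        off = self._offset(base_reg, idx, elwidth_bytes)
        return int.from_bytes(self.sram[off:off + elwidth_bytes], "little")

    def read_reg(self, reg):
        """Scalar (full-width) view of one register."""
        return self.read_elem(reg, 0, self.reg_bytes)


rf = FlatRegfile()
# five 16-bit elements starting at r5: the fifth spills into r6
for i, v in enumerate((0x1111, 0x2222, 0x3333, 0x4444, 0x5555)):
    rf.write_elem(5, i, 2, v)
```

once the ordering is fixed like this, a scalar read of r5 or r6 after the
vector writes has one well-defined answer, which is the whole point: scalar
and vector views share the same bytes, so the ordering cannot be left
"architecturally invisible".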

of course, they don't have to deal with IBM numbering *sigh*

> Just my two cents.
> 
> I do admit that, as I reread the thread, I'm still thoroughly confused.

it's why i'm very, very reluctant to go messing with it.  it took five months
to get LD/ST right, and four months to get CR operations, mtcr and mfocr right.
