[Libre-soc-isa] [Bug 560] big-endian little-endian SV regfile layout idea

Mon Jan 4 05:42:09 GMT 2021

https://bugs.libre-soc.org/show_bug.cgi?id=560

--- Comment #40 from Alexandre Oliva <oliva at libre-soc.org> ---
the ld instructions in the example don't request byte-reversal.

in big-endian mode, the lowest-address byte of a dword (the one holding the 1)
is the MSByte, and the highest-address byte of that dword (the one holding the
8) is the LSByte: the in-memory-order array {1,2,3,4,5,6,7,8}, if read as a
dword, comes out as 0x0102030405060708
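
a minimal sketch of that read, in Python rather than Power ISA assembly. it assumes a plain big-endian 8-byte load with no byte-reversal; the names `mem` and `dword_be` are only illustrative:

```python
# in-memory-order array, lowest address first
mem = bytes([1, 2, 3, 4, 5, 6, 7, 8])

# model of a big-endian dword load without byte-reversal:
# the lowest-address byte becomes the most significant byte
dword_be = int.from_bytes(mem, "big")

msbyte = dword_be >> 56    # byte that was at the lowest address
lsbyte = dword_be & 0xFF   # byte that was at the highest address

print(hex(dword_be), msbyte, lsbyte)
```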

without an explicit request for byte-reversal, the ld instruction will keep the
MSByte in the MSByte, and the LSByte in the LSByte, even though it seems to
reorder the bytes because of the way we represent the register.

our vector insns iterate from LS to MS elements.

therefore, in BE, our vector insn will visit the 8 as the first byte element,
and the 1 as the 8th byte element.
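
the same step as a sketch, assuming the register holds the value of a big-endian dword load and that byte elements are visited from least to most significant. `byte_elements_ls_to_ms` is a hypothetical helper, not anything from the spec:

```python
def byte_elements_ls_to_ms(reg, width=8):
    """Return the byte elements of a register value, least significant first."""
    return [(reg >> (8 * i)) & 0xFF for i in range(width)]

# register contents after a big-endian dword load of {1,2,3,4,5,6,7,8}
reg = int.from_bytes(bytes([1, 2, 3, 4, 5, 6, 7, 8]), "big")

elems = byte_elements_ls_to_ms(reg)
print(elems)  # first element visited is the 8, eighth is the 1
```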

can you point at any error in this proof?

yes, if you use an alternate instruction to load the vector (e.g., load with
byte swapping, or byte vector load), you get different results, but if you're
answering a question other than the one I asked when seemingly answering the
question I asked, you're not clarifying, you're making for confusion, because
what I expect is an answer to the question I posed, not to something else that
you'd rather answer.

now, why is that so freaking important, you might be asking yourself.

it's because there are general expectations of how data is laid out.

when we state the vector can be indexed like an array, it's taken as meaning
that neighboring vector elements are at adjacent addresses in memory, and that
lower-indexed ones are at lower addresses.

if I tell the compiler that I have an array of 8 bytes and initialize it in
memory as {1,2,3,4,5,6,7,8}, it will lay them out this way.

loading vectors from memory, and storing them in memory, are also expected to
abide by these normal layout assumptions.

when you index arrays in vector registers, there's still an expectation that
the indexing gets you the same elements you'd get in memory order.

but what you're saying is that, in our system, if we do what is generally
expected to do the job, you'll get BE vector indexing backwards on a
per-load-unit basis.

I say per-load-unit because if you load the vector one byte element at a time,
you preserve the indexing in the vector register.  if you load a half-word at a
time, you get odd and even bytes swapped.  if you load a word at a time, you
get reversed indexing within each word.  and so on.
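
a rough model of that per-load-unit effect, under these assumptions: each unit is loaded big-endian without byte-reversal, units land in the register in increasing element order, and byte elements are then visited from least to most significant. `load_be_units` is a hypothetical name used only for this sketch:

```python
def load_be_units(mem, unit):
    """Byte elements seen in LS-to-MS order after loading `mem` in BE units."""
    seen = []
    for off in range(0, len(mem), unit):
        # big-endian load of one unit: lowest address becomes the MSByte
        value = int.from_bytes(mem[off:off + unit], "big")
        # visiting that unit's bytes LS-first reverses them within the unit
        seen.extend(value.to_bytes(unit, "little"))
    return seen

mem = bytes([1, 2, 3, 4, 5, 6, 7, 8])
for unit in (1, 2, 4, 8):
    print(unit, load_be_units(mem, unit))
```

only the byte-at-a-time load keeps the register indexing equal to memory order; every wider unit reverses the indexing inside each unit.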

can that be worked around?  quite likely.  the compiler will have to figure out
how the units get reordered depending on the way the register was set up, and
adjust the indexing to match.  the problem with that is that there are plenty
of operations that are supposed to be no-ops, but that in this scheme won't be,
so the compiler might quietly optimize out that which would fix the mess. 
that's what one gets for breaking generally-held assumptions.

now, the other way to go about this is to never ever allow mode changes in
vectors: you load the vector from memory for use in a certain way, you use it
that way only, and if you wish to use it a different way, you store it and load
it back the other way.  and cross your fingers that the compiler won't optimize
that out too, because the store and load back also seem like a nop from the
general layout model in the compiler.

I see a lot of risk there, risk that could be averted by using an
endianness-compatible iteration order within vector registers, that would match
(corresponding) memory order.  that would enable us to bundle memory ops into
dwords, even in BE mode, and have the vector registers hold data in the
expected significance order.
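
one possible reading of that alternative, as a sketch only: the iteration direction within a register follows the endianness mode, so BE visits byte elements from most to least significant. the `big_endian` flag and the helper name are assumptions made for illustration, not a proposed encoding:

```python
def byte_elements(reg, width=8, big_endian=False):
    """Visit byte elements LS-first in LE mode, MS-first in BE mode."""
    order = range(width - 1, -1, -1) if big_endian else range(width)
    return [(reg >> (8 * i)) & 0xFF for i in order]

mem = bytes([1, 2, 3, 4, 5, 6, 7, 8])

# whole-dword loads in each mode, no byte-reversal requested
reg_be = int.from_bytes(mem, "big")
reg_le = int.from_bytes(mem, "little")

# with a mode-matched iteration order, both modes recover memory order
print(byte_elements(reg_be, big_endian=True))
print(byte_elements(reg_le, big_endian=False))
```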

without that, by departing from the normal conventions, you may think you're
avoiding the trouble of working out endianness, but instead you're pushing that
trouble onto every other upper layer, by turning what is a well-understood
model (for those who've worked it out) into something that's different enough
to make trouble but not enough to disable all learned intuitions, so it will
lead to errors.
