[Libre-soc-isa] [Bug 560] big-endian little-endian SV regfile layout idea

bugzilla-daemon at libre-soc.org
Wed Dec 30 22:48:21 GMT 2020


https://bugs.libre-soc.org/show_bug.cgi?id=560

--- Comment #10 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Alexandre Oliva from comment #5)

> if you take a zero-initialized vector, and use a byte-load instruction
> svp64-prefixed with ELWIDTH=h ELWIDTH_SRC=default to load and zero-extend
> each byte of a string into a half-word, and then store the registers holding
> the vector in memory M[], should you get the string's bytes in M[odd] or
> M[even] bytes?  should it not depend on endianness?

yes... by respecting the MSR LE/BE bit and respecting whether the bytereverse
version of LD/ST was called or not.
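
as a rough illustration of why the MSR bit and the bytereverse variant both
matter, here is a minimal python sketch.  it is only a model, and it assumes
several things that are not settled anywhere in this thread: that the vector
fits in one 64-bit register, that element 0 sits in the least-significant
bits, and that the register is written out with a single doubleword store.
`store_doubleword` and the byte values are made up for the example.

```python
def store_doubleword(reg_value, msr_le, byte_reverse=False):
    """Model of a 64-bit store: return the 8 bytes as they land in memory.

    msr_le       -- True if the (assumed) processor mode is little-endian
    byte_reverse -- True if the byte-reversed store variant is used
    """
    # A byte-reversed store flips whatever order the current mode implies.
    little = msr_le != byte_reverse
    return reg_value.to_bytes(8, "little" if little else "big")


# Four zero-extended halfwords 0x004E, 0x004D, 0x004F, 0x0050 (arbitrary
# stand-ins for string bytes), with element 0 in the least-significant bits.
reg_value = 0x0050_004F_004D_004E

le = store_doubleword(reg_value, msr_le=True)
be = store_doubleword(reg_value, msr_le=False)
print(" ".join(f"{b:02x}" for b in le))  # data bytes at even offsets
print(" ".join(f"{b:02x}" for b in be))  # data bytes at odd offsets
```

in this model the LE store puts the string bytes at even offsets, while the
BE store puts them at odd offsets and also reverses the element order.
whether that is what is wanted is exactly the thing that needs evaluation.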

with the weird ordering-dyslexia that i have i am not really the best person to
definitively say "it should be odd or even".

however what i _can_ say is that this is a good point, needs close evaluation,
and interaction with the sign-extension, elwidth overrides and also saturation
is going to have to be thought through carefully.

i do however expect it to be straightforward and self-evident.

and, i do have to say this: *not* then massively complicating things by adding
the extra dimension of the regfile itself being allowed to byteswap will make
that evaluation one hell of a lot easier.


> now, if your maxvl is 3, which pair of consecutive bytes in memory is
> guaranteed to have zeros, M[0..1] or M[6..7]?

with VL counting from 0, and with bytes in the regfile SRAM matching that (i.e.
NOT MASSIVELY COMPLICATING THINGS by having regfile byteswapping), the answer
needs a walkthrough.

reminder: byte-load instruction svp64-prefixed with ELWIDTH=h
ELWIDTH_SRC=default 

there is missing information here.  we assume LE processor mode.  also we
assume unit-strided Vector LD.

memory is:

0 1 2 3 4 5 6 7
NNMMOOPPQQRRSSTT.. 

this will be:

* load one byte (elwidth src is default)
  at address RA+0
* zero-extend 8 bit to 16 bit
* data is therefore 0x00NN
* vstart=0
* hword containing data goes
  into int_regfile[RT].h[0]

WE ASSUME LE SRAM ON REGFILE BECAUSE OTHERWISE IT IS TOTAL HELL (and needs a
usecase, and needs encoding space, and needs full evaluation which we don't
realistically have time for)

byte 0 of RT: NN
byte 1 of RT: 00

next, vstart=1

* load next byte (elwidth src is default)
  at address RA+1
* zero-extend to 16 bit
* data is therefore 0x00MM
* vstart=1
* hword containing data goes
  into int_regfile[RT].h[1]

byte 0 of RT: NN
byte 1 of RT: 00
byte 2 of RT: MM
byte 3 of RT: 00

and for vstart=2 it should be clear: byte 4 of RT is OO, byte 5 is 00, and
with VL=3 bytes 6 and 7 of RT are left unmodified.
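
the element steps above can be sketched in python.  this is only a model under
the same assumptions as the walkthrough (LE processor mode, unit-strided LD,
LE byte order in the regfile SRAM).  the names `regfile` and `sv_lbz_zext16`,
the choice of RT=1 and the byte values are all made up for illustration.

```python
REG_BYTES = 8  # one 64-bit integer register


def sv_lbz_zext16(mem, ra, regfile, rt, vl):
    """Model: byte load, source elwidth default (8 bit), dest elwidth h (16 bit)."""
    for i in range(vl):
        value = mem[ra + i]                  # load one byte at address RA+i
        hword = value.to_bytes(2, "little")  # zero-extend 8 -> 16 bit, LE order
        offset = rt * REG_BYTES + i * 2      # location of int_regfile[RT].h[i]
        regfile[offset:offset + 2] = hword
    return regfile


# Arbitrary stand-ins for the memory bytes NN MM OO PP QQ RR SS TT.
mem = bytes([0x4E, 0x4D, 0x4F, 0x50, 0x51, 0x52, 0x53, 0x54])
regfile = bytearray(4 * REG_BYTES)  # four zero-initialised registers

sv_lbz_zext16(mem, ra=0, regfile=regfile, rt=1, vl=3)
print(" ".join(f"{b:02x}" for b in regfile[8:16]))  # bytes 0..7 of RT
```

with vl=3 only bytes 0 to 5 of RT are written, so bytes 6 and 7 keep whatever
was there before (zero, for a zero-initialised vector).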


> does the answer change if maxvl is 4, but vl is still 3?

no.

> 
> what if they're both 7?

byte 0 of RT: NN
byte 1 of RT: 00
byte 2 of RT: MM
byte 3 of RT: 00
byte 4 of RT: OO
byte 5 of RT: 00
byte 6 of RT: PP
byte 7 of RT: 00
# now we have crossed over a 64 bit boundary
# into the next register, RT+1
byte 0 of RT+1: QQ
byte 1 of RT+1: 00
byte 2 of RT+1: RR
byte 3 of RT+1: 00
byte 4 of RT+1: SS
byte 5 of RT+1: 00
byte 6 of RT+1: unmodified
byte 7 of RT+1: unmodified
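
the layout above follows from simple index arithmetic, sketched here in
python.  `element_location` is a hypothetical helper, and the sketch assumes
64-bit registers, 16-bit elements packed contiguously, and the LE regfile
ordering used throughout this walkthrough.

```python
REG_BYTES = 8      # 64-bit registers
ELWIDTH_BYTES = 2  # ELWIDTH=h, i.e. 16-bit elements


def element_location(i, elwidth_bytes=ELWIDTH_BYTES):
    """Return (register offset from RT, first byte within that register)."""
    byte_index = i * elwidth_bytes
    return byte_index // REG_BYTES, byte_index % REG_BYTES


for i in range(7):  # VL=7
    reg_off, byte_off = element_location(i)
    print(f"element {i}: RT+{reg_off}, bytes {byte_off}..{byte_off + 1}")
```

element 4 is the first one to land in RT+1, and element 6 ends at byte 5 of
RT+1, which is why bytes 6 and 7 of RT+1 are unmodified.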

if that is not the desired layout, the simple solution: call a bitmanip
byteswapping opcode.  vectorised of course.
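
a sketch of the effect such a vectorised byteswap would have, in python.  no
specific opcode is named here, so `byteswap_hwords` is purely hypothetical: it
assumes the swap is applied within each 16-bit element, and it only models
the resulting byte layout, not any real instruction.

```python
def byteswap_hwords(regbytes, vl):
    """Swap the two bytes of each of the first vl 16-bit elements."""
    out = bytearray(regbytes)
    for i in range(vl):
        lo = 2 * i
        out[lo], out[lo + 1] = regbytes[lo + 1], regbytes[lo]
    return bytes(out)


# RT after the walkthrough with four elements: NN 00 MM 00 OO 00 PP 00
# (0x4E, 0x4D, 0x4F, 0x50 are arbitrary stand-ins for NN, MM, OO, PP).
rt = bytes([0x4E, 0x00, 0x4D, 0x00, 0x4F, 0x00, 0x50, 0x00])
swapped = byteswap_hwords(rt, vl=4)
print(" ".join(f"{b:02x}" for b in swapped))
```

elements beyond vl are left alone, matching the "unmodified" bytes in the
layout above.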
