[Libre-soc-isa] [Bug 560] big-endian little-endian SV regfile layout idea
bugzilla-daemon at libre-soc.org
Wed Dec 30 22:48:21 GMT 2020
https://bugs.libre-soc.org/show_bug.cgi?id=560
--- Comment #10 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Alexandre Oliva from comment #5)
> if you take a zero-initialized vector, and use a byte-load instruction
> svp64-prefixed with ELWIDTH=h ELWIDTH_SRC=default to load and zero-extend
> each byte of a string into a half-word, and then store the registers holding
> the vector in memory M[], should you get the string's bytes in M[odd] or
> M[even] bytes? should it not depend on endianness?
yes... by respecting the MSR LE/BE bit and respecting whether the bytereverse
version of LD/ST was called or not.
with the weird ordering-dyslexia that i have i am not really the best person to
definitively say "it should be odd or even".
however what i _can_ say is that this is a good point, needs close evaluation,
and interaction with the sign-extension, elwidth overrides and also saturation
is going to have to be thought through carefully.
i do however expect it to be straightforward and self-evident.
and, i do have to say this: *not* then massively complicating things by adding
the extra dimension of the regfile itself being allowed to byteswap will make
that evaluation one hell of a lot easier.
> now, if your maxvl is 3, which pair of consecutive bytes in memory is
> guaranteed to have zeros, M[0..1] or M[6..7]?
with VL counting from 0 and with bytes in the regfile SRAM matching that, i.e.
NOT MASSIVELY COMPLICATING THINGS by having regfile byteswapping, the answer
needs a walkthrough.
reminder: byte-load instruction svp64-prefixed with ELWIDTH=h
ELWIDTH_SRC=default
there is missing information here. we assume LE processor mode. also we
assume unit-strided Vector LD.
memory is:
addr: 0  1  2  3  4  5  6  7
data: NN MM OO PP QQ RR SS TT ..
this will be:
* load one byte (elwidth src is default) at address RA+0
* zero-extend 8 bit to 16 bit
* data is therefore 0x00NN
* vstart=0
* hword containing data goes into int_regfile[RT].h[0]
WE ASSUME LE SRAM ON REGFILE BECAUSE OTHERWISE IT IS TOTAL HELL (and needs a
usecase, and needs encoding space, and needs full evaluation which we don't
realistically have time for)
byte 0 of RT: NN
byte 1 of RT: 00
next, vstart=1
* load next byte (elwidth src is default) at address RA+1
* zero-extend to 16 bit
* data is therefore 0x00MM
* vstart=1
* hword containing data goes into int_regfile[RT].h[1]
byte 0 of RT: NN
byte 1 of RT: 00
byte 2 of RT: MM
byte 3 of RT: 00
and for vstart=2 it should be clear: 0x00OO goes into int_regfile[RT].h[2],
i.e. bytes 4 and 5 of RT.
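the walkthrough above can be sketched as a small model. this is only an
illustration under the assumptions already stated (LE processor mode,
unit-strided load, LE byte-ordered regfile SRAM). the names
sv_byte_load_to_hword, regfile and REG_BYTES are hypothetical, and the
values 0x11, 0x22, ... simply stand in for NN, MM, ...

```python
REG_BYTES = 8  # one 64-bit integer register, modelled as 8 LE-ordered bytes


def sv_byte_load_to_hword(mem, ra, regfile, rt, vl):
    """Hypothetical model: unit-strided byte load, 16-bit destination elements."""
    for i in range(vl):
        value = mem[ra + i]               # load one byte at address RA+i
        hword = value & 0x00FF            # zero-extend 8 bit to 16 bit (0x00NN)
        offset = rt * REG_BYTES + 2 * i   # int_regfile[RT].h[i]
        # assumption: regfile bytes are stored little-endian
        regfile[offset:offset + 2] = hword.to_bytes(2, "little")
    return regfile


mem = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])
regfile = bytearray(4 * REG_BYTES)        # zero-initialised, 4 registers
sv_byte_load_to_hword(mem, ra=0, regfile=regfile, rt=1, vl=3)
rt_bytes = bytes(regfile[8:16])           # the 8 bytes of RT (here RT=1)
print(rt_bytes.hex(" "))                  # 11 00 22 00 33 00 00 00
```

in this model, with a zero-initialised vector and VL=3, bytes 6 and 7 of RT
are the ones the load never touches.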
> does the answer change if maxvl is 4, but vl is still 3?
no.
>
> what if they're both 7?
byte 0 of RT: NN
byte 1 of RT: 00
byte 2 of RT: MM
byte 3 of RT: 00
byte 4 of RT: OO
byte 5 of RT: 00
byte 6 of RT: PP
byte 7 of RT: 00
# now we have crossed over a 64 bit boundary
# into the next register, RT+1
byte 0 of RT+1: QQ
byte 1 of RT+1: 00
byte 2 of RT+1: RR
byte 3 of RT+1: 00
byte 4 of RT+1: SS
byte 5 of RT+1: 00
byte 6 of RT+1: unmodified
byte 7 of RT+1: unmodified
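the element-to-register mapping used in the layout above can be written
down as a short sketch. it assumes 64-bit registers, ELWIDTH=h, and the same
LE regfile ordering as before. hword_location is a hypothetical helper name,
and rt = 3 is an arbitrary choice for illustration.

```python
REG_BYTES = 8    # 64-bit registers
HWORD_BYTES = 2  # ELWIDTH=h


def hword_location(rt, i):
    """Return (register number, byte offset in that register) for hword element i."""
    reg_offset, byte_offset = divmod(i * HWORD_BYTES, REG_BYTES)
    return rt + reg_offset, byte_offset


rt = 3  # arbitrary register number, for illustration only
layout = [hword_location(rt, i) for i in range(7)]  # VL=7
for i, (reg, off) in enumerate(layout):
    print(f"element {i}: bytes {off}..{off + 1} of r{reg}")
```

elements 0 to 3 land in RT, elements 4 to 6 land in RT+1, and nothing maps to
bytes 6 and 7 of RT+1, which is why they are left unmodified.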
if that is not the desired layout, the simple solution: call a bitmanip
byteswapping opcode. vectorised of course.
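as a rough sketch of what such a vectorised byteswap would do to the layout
above: swap the two bytes of each 16-bit element, for elements 0 to VL-1, and
leave everything else alone. no specific opcode is implied here.
byteswap_hwords is a hypothetical helper, and 0xAA/0xBB stand in for
unmodified bytes.

```python
def byteswap_hwords(reg_bytes, vl):
    """Swap the two bytes of each 16-bit element, for elements 0..vl-1 only."""
    out = bytearray(reg_bytes)
    for i in range(vl):
        lo, hi = out[2 * i], out[2 * i + 1]
        out[2 * i], out[2 * i + 1] = hi, lo
    return bytes(out)


before = bytes([0x11, 0x00, 0x22, 0x00, 0x33, 0x00, 0xAA, 0xBB])
after = byteswap_hwords(before, vl=3)
print(after.hex(" "))  # 00 11 00 22 00 33 aa bb
```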