[Libre-soc-isa] [Bug 1056] questions and feedback (v2) on OPF RFC ls010
bugzilla-daemon at libre-soc.org
Thu Jun 1 02:12:01 BST 2023
https://bugs.libre-soc.org/show_bug.cgi?id=1056
--- Comment #44 from Paul Mackerras <paulus at ozlabs.org> ---
(In reply to Luke Kenneth Casson Leighton from comment #35)
> (In reply to Paul Mackerras from comment #30)
>
> > I think you mean sv.addi/elwidth=16 5,5,0x1122 (not 5,_0_,0x1122).
>
> ah! yes
>
> > I'll assume the 0 for RA is a typo caused by 3.27AM.
> >
> > > * then inspect (verilator) GPR(5) and read its contents
> > >
> > > is the answer you expect, regardless of LE/BE: 0x2356?
> > > or would it be
> > > * 0x2211_0000_0000_1234 (or 0x1122_0000_0000_1234) *or*
> > > * 0x0000_0000_0000_3456 due to addi being implicitly
> > > reversed-byte-order from sv.addi under BE?
> >
> > I would expect 0x1122_0000_0000_1234 in BE mode, since you have operated on
> > element 0 and elements are 16 bits wide.
>
> ahhh now *that* makes it clear. and is so far left-field of what i
> was modelling/expecting from the combinatorial explosion of possibilities
> that i couldn't possibly guess it :)
>
> now, here's the thing (walk through the implications). where the LE
> element-access would be this:
>
> # assume everything LE-ordered and LSB-numbered
> gpr_width = 8 # bytes
> num_gprs = 128 # in "upper" SV Compliancy Levels
> GPR_sram = [0x00] * gpr_width * num_gprs
> src_elbytes = src_elwidth // 8
> for i in range(VL):
> bytenum = i * src_elbytes # element offset in SRAM bytes
> ra_element_start = RA*gpr_width # vector start position
> ra_element_start += bytenum # element offset
> ra_element_end = ra_element_start + (src_elbytes-1)
> ra_src_operand = GPR_sram[ra_element_start thru ra_element_end]
>
> a BE-reversal of the underlying SRAM-access would be:
>
> # *still* assume everything LE-ordered and LSB-numbered
> gpr_width = 8 # bytes
> num_gprs = 128 # in "upper" SV Compliancy Levels
> GPR_sram = [0x00] * gpr_width * num_gprs
> src_elbytes = src_elwidth // 8
> for i in range(VL):
> offset = i * src_elbytes # element offset in SRAM bytes
> gpr_num = offset // gpr_width # relative GPR number
> bytenum = offset % gpr_width # byte-start in GPR
> ----> bytenum = ~bytenum & 0b1111_1111 # BE-inversion
No, this isn't right. It should be
bytenum = bytenum ^ (8 - src_elbytes)
> # now finally we know the element-offset start pos
> ra_element_start = (gpr_num * gpr_width) + bytenum
> ra_element_start += RA*gpr_width # add vector start position
> ra_element_end = ra_element_start + (src_elbytes-1)
> ra_src_operand = GPR_sram[ra_element_start thru ra_element_end]
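A minimal runnable sketch of the element addressing with the XOR correction applied, assuming the same LE-ordered, LSB-numbered byte-array register file as the quoted pseudocode. The names `element_byte_range` and `GPR_WIDTH` are hypothetical and used only for illustration; this is not taken from any Libre-SOC simulator. For 16-bit elements the XOR constant is 8 - 2 = 6, so element 0 lands in bytes 6..7 of the register (the most significant 16 bits), whereas `~bytenum & 0b1111_1111` would map byte 0 to 255, which is outside an 8-byte register.

```python
GPR_WIDTH = 8  # bytes per 64-bit GPR (assumption, as in the quoted pseudocode)


def element_byte_range(ra, i, elwidth, big_endian):
    """Return the inclusive (start, end) byte indices of element i of the
    vector starting at register ra, in an LE-ordered byte-array regfile."""
    elbytes = elwidth // 8
    offset = i * elbytes              # element offset in bytes
    gpr_num = offset // GPR_WIDTH     # relative GPR number
    bytenum = offset % GPR_WIDTH      # byte-start within that GPR
    if big_endian:
        # Reverse the order of the elements within the register,
        # leaving the byte order inside each element untouched.
        bytenum ^= GPR_WIDTH - elbytes
    start = (ra + gpr_num) * GPR_WIDTH + bytenum
    return start, start + elbytes - 1


# Element 0 of a 16-bit vector starting at GPR 5:
print(element_byte_range(5, 0, 16, big_endian=False))  # low 2 bytes of GPR 5
print(element_byte_range(5, 0, 16, big_endian=True))   # high 2 bytes of GPR 5
```

The XOR works because `bytenum` is always a multiple of `elbytes` and `GPR_WIDTH - elbytes` has exactly the bits that select the element slot within the register.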
>
>
> at which point i think you'd agree that trying to explain that to
> programmers, that this is the underlying model, would be a bit much :)
>
>
> > > now the same thing with *scalar* instructions:
> > >
> > > * let us set (verilator or "addi 5,0,0x1234") the contents of GPR(5) = 0x1234
> > > * perform "addi 5,5,0x1122"
> > > * then inspect (verilator) GPR(5) and read its contents
> > >
> > > is it *still* 0x2356 regardless of LE/BE?
> >
> > It's 0x2356 regardless of LE/BE.
>
> and that discrepancy is a violation of (one of the) Orthogonality rule(s).
> when MAXVL=VL=1 the behaviour *has* to be the same (elwidth
> notwithstanding)
The behaviour clearly does depend on elwidth (even in LE mode), because the
scalar instruction writes all 64 bits of the register but the vectorized
instruction with VL=1 only writes elwidth bits.
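The difference can be seen in a toy model, sketched here under stated assumptions: registers are plain 64-bit integers, the immediate is non-negative, and the vectorized form with VL=1 reads and writes only element 0, whose position in BE mode is the most significant elwidth bits as described earlier in this thread. The function names `scalar_addi` and `vec_addi_el0` are hypothetical, not part of any existing simulator.

```python
MASK64 = (1 << 64) - 1


def scalar_addi(reg, imm):
    """Scalar addi RT,RA,imm with RT == RA: writes all 64 bits."""
    return (reg + imm) & MASK64


def vec_addi_el0(reg, imm, elwidth, big_endian):
    """sv.addi/elwidth=N with VL=1: only element 0 is read and written.
    Assumption: in BE mode element 0 is the most significant elwidth bits."""
    shift = (64 - elwidth) if big_endian else 0
    mask = (1 << elwidth) - 1
    elem = (reg >> shift) & mask
    result = (elem + imm) & mask          # carry out of the element is dropped
    # Preserve every bit outside element 0.
    return (reg & ~(mask << shift) & MASK64) | (result << shift)


print(hex(scalar_addi(0x1234, 0x1122)))              # 0x2356
print(hex(vec_addi_el0(0x1234, 0x1122, 16, False)))  # 0x2356
print(hex(vec_addi_el0(0x1234, 0x1122, 16, True)))   # 0x1122000000001234
# Even in LE mode the two differ once the sum overflows the element:
print(hex(scalar_addi(0xFFFF, 0x1122)))              # 0x11121
print(hex(vec_addi_el0(0xFFFF, 0x1122, 16, False)))  # 0x1121
```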