[Libre-soc-isa] [Bug 1056] questions and feedback (v2) on OPF RFC ls010
bugzilla-daemon at libre-soc.org
Wed May 31 15:23:56 BST 2023
https://bugs.libre-soc.org/show_bug.cgi?id=1056
--- Comment #36 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #35)
> (In reply to Paul Mackerras from comment #30)
> > In fact, depending on how elwidth affects loads and stores, there may be
> > another answer to my original concern about loading an array of values into
> > registers. It's possible that doing sv.ld/elwidth=16 r3,0(r4) with VL=4 will
> > load four 16-bit elements into r3 in the right order for future operations,
> > but I don't know for sure.
i missed that that's ld and not lhz. if it's sv.lhz/elwid=16 *r3, 0(r4) with
VL=4, then it loads a contiguous array of four 16-bit elements from memory
(each read as LE or BE according to the current mode) and packs them into r3
in LE element order, as needed for svp64. e.g. in BE mode:
mem at *r4:
0x01 0x23 0x45 0x67 0x89 0xab 0xcd 0xef
result in r3 (in LE, since that's always what svp64 uses for registers):
0xcdef_89ab_4567_0123
because it treats it as loading uint16_t[4].
the load uses unitstride mode which is afaict what we want here.
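
as a rough sketch of the example above (plain python, not the spec
pseudocode; the helper name and its arguments are made up for illustration),
assuming unit stride with immed=0 and element 0 landing in the lowest 16 bits
of r3:

```python
def lhz_ew16_unitstride(mem, vl=4, big_endian=True):
    """hypothetical model of sv.lhz/elwid=16 *r3, 0(r4) in unit-stride mode."""
    byteorder = "big" if big_endian else "little"
    reg = 0
    for i in range(vl):
        # unit stride with immed=0: element i sits at byte offset i * 2
        elem = int.from_bytes(mem[i * 2:i * 2 + 2], byteorder)
        # registers are LE-ordered: element 0 goes in the lowest bits
        reg |= elem << (16 * i)
    return reg


mem = bytes([0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef])
print(f"0x{lhz_ew16_unitstride(mem):016x}")  # prints 0xcdef89ab45670123
```

this only models a single 64-bit destination register with VL=4; it doesn't
try to cover spilling into following registers.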
> elif svctx.ldstmode == elementstride:
>     # element stride mode
>     srcbase = ireg[RA]
>     offs = i * immed # we want this one
no, afaict.
> elif svctx.ldstmode == unitstride:
>     # unit stride mode
>     srcbase = ireg[RA]
>     offs = immed + (i * op_width) # we don't want this one
we *do* want this one afaict.
>
> so, to match the english-language words you use with the assembler,
> you wanted:
>
> sv.lh/ew=16/els r3,16(r4)
>
> which will load QTY4 16-bit contiguous elements starting at r4,
> and drop them (also contiguously) into r3.
no it won't, it will load a 16-bit value every 16 *bytes* starting at r4,
since in element stride mode the immediate is the stride in bytes.
lkcl, you were probably thinking of sv.lhz/elwid=16/els *r3, 2(r4), since u16
is 2 bytes.
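
to make the difference concrete, here's a rough python sketch of the two
offset formulas from the quoted pseudocode (the function name and the mode
strings are just for illustration, and srcbase is left out so these are byte
offsets relative to r4):

```python
def ld_offsets(mode, immed, op_width, vl):
    """hypothetical helper: per-element byte offsets relative to srcbase."""
    if mode == "unitstride":
        # offs = immed + (i * op_width)
        return [immed + i * op_width for i in range(vl)]
    if mode == "elementstride":
        # offs = i * immed
        return [i * immed for i in range(vl)]
    raise ValueError(f"unknown mode: {mode}")


print(ld_offsets("unitstride", 0, 2, 4))      # [0, 2, 4, 6]
print(ld_offsets("elementstride", 16, 2, 4))  # [0, 16, 32, 48]
print(ld_offsets("elementstride", 2, 2, 4))   # [0, 2, 4, 6]
```

so unit stride with immed=0 and element stride with immed=2 both touch the
same four contiguous u16s, while /els with 16(r4) skips 16 bytes per element.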