[libre-riscv-dev] spike-sv non-default element widths

lkcl lkcl at libre-riscv.org
Mon Oct 22 00:52:42 BST 2018


On Sun, Oct 21, 2018 at 11:55 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> On Sun, Oct 21, 2018 at 3:01 PM lkcl <lkcl at libre-riscv.org> wrote:
>
> > On Sun, Oct 21, 2018 at 8:53 PM Jacob Lifshay <programmerjake at gmail.com>
> > wrote:

> > > It just depends on if it's more important to cast after the add or before
> > > the add. I recommend having it be sign/zero-extended before the add as an
> > > explicit cast instruction can be used if after-the-add is needed, whereas
> > > two are necessary if we use vice versa.
> >
> >  right.  ok.  so this is important.  bear in mind that there's ADDW
> > (RV64I) and also ADDD (RV128I)
> >
> I was treating it as

 perhaps you ended the sentence early here?

> >  jumping down to 8/16/32/64 i intended to be done by setting XLEN=32
> > (UXL), where, obviously, in doing so, that will restrict address-space
> > to 4GB as well, so care will definitely need to be taken: simply
> > avoiding the use of addressing-dependent opcodes may be sufficient.
> >
>
> I highly recommend using some method other than changing XLEN as that
> changes the whole ABI used.

 if temporary (i.e. flipping back before a function call) it wouldn't.

> LLVM does not support changing the ABI at that
> level of resolution as it assumes it's constant for, if not the entire
> program, at least for an entire module. I believe GCC is even more strict
> about changing the ABI (I know that separate programs are usually necessary
> for each architecture, whereas LLVM can have them all built into one
> program).

 hmmmm....

> If we pick a different method (maybe expanding the width field by 1 bit and
> using constant sizes instead of XLEN-dependent sizes) then we wouldn't have
> the problem with needing to have vectors allocated in a certain portion of
> the address space as that's usually a giant mess.

 ok, so the implications of adding one extra bit would be that almost
the entirety of SV's CSR paradigm would need to be redone.  it was a
struggle to fit even one extra bit into the reg csr entry, once regidx
was extended to 7 bits: "packed" had to be moved to the CSR-pred
table, and now that's completely full @ 16 bits as well.

 i hadn't realised that changing XLEN would be so problematic (LLVM etc.).

 so, going back to the drawing board on what 4 bitwidths we care
about, i believe it's:

 * 8-bit
 * 16-bit
 * 32-bit
 * 64-bit

and that will actually fit into 2 bits.  128-bit int and FP i actually
don't care so much about: i don't see their value for an embedded 3D
GPU or VPU (or for a mid-to-high-end GPU or VPU, except possibly for
some specialist parallel compute purposes, for which it is doubtful
that someone would use SV anyway).

 if you agree with that assessment, the question would be "how to
specify those 4 when XLEN=32" and "how to specify those 4 when
XLEN=64".  i would argue that a reasonable thing to do would be to
have 0b00=XLEN, 0b01=(XLEN==64)?32:64, 0b10=16 and 0b11=8.  in that
way, default behaviour can always be detected when elwidth is zero,
which will make programmers' lives easier: if they don't want
polymorphic bitwidths, they just set "elwidth=0".
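
 as a minimal sketch of the encoding proposed above (the function name
element_width is made up for illustration and is not part of any SV
spec or of spike-sv), the 2-bit field could be decoded like this:

```python
def element_width(elwidth: int, xlen: int) -> int:
    """Decode a 2-bit elwidth field into an element width in bits.

    Hypothetical helper: it only mirrors the table proposed in this
    email (0b00=XLEN, 0b01=(XLEN==64)?32:64, 0b10=16, 0b11=8).
    """
    if xlen not in (32, 64):
        raise ValueError("this sketch only covers XLEN=32 and XLEN=64")
    if not 0 <= elwidth <= 0b11:
        raise ValueError("elwidth is a 2-bit field")
    if elwidth == 0b00:
        return xlen                       # default: no override
    if elwidth == 0b01:
        return 32 if xlen == 64 else 64   # the "other" one of 32/64
    if elwidth == 0b10:
        return 16
    return 8                              # 0b11


for xlen in (32, 64):
    print(xlen, [element_width(e, xlen) for e in range(4)])
```

 the point of the layout is that elwidth=0 means "default" for both
XLEN=32 and XLEN=64, while all four widths stay reachable in each case.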

 i notice however you're proposing something slightly different,
below, which would be worth exploring.

> >  i.e. it's *not* done by using ADDIW / ADDW, or ADDID / ADDD from
> > RV64I and RV128I respectively, as one might intuitively expect.
> >
> I do think using the w and d variants to determine the width would work as
> the code being compiled would almost always specify the element width at
> compile time and just have the element count be variable.

 ok so if i am reading you correctly, effectively we're missing one
bit from elwidth, and you're proposing to use the difference between
the RV32I and the RV64I variants as a way to extend the width space.
we can't realistically use the RV128I space as i believe it overlaps
with custom-2/custom-3, and hasn't been fully spec'd yet anyway.

 so under this concept, we have RV32I and RV64I to play with.  let's
assume XLEN=64, below.

* add, addi: the elwidths of rs and rd can all be 8/16/32/64.
zero-extension takes place.  for addi, the immediate is sign-extended
(or truncated) to rs1's bitwidth.

* addw, addiw: section 4.2 of the RV spec says that "*W" always
produces 32-bit signed values.  this aspect may be more important to
preserve than extending the bitwidth in some way.
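
 one possible reading of the two bullets above, as a rough sketch
only: the helper names (sext, zext, addi_elwidth, addw) are
hypothetical, the choice to mask the source element and wrap the result
to the element width is an assumption, and what happens when rd has a
different elwidth is deliberately left out because it isn't settled
here.  the addw part follows standard RV64I behaviour (32-bit result,
sign-extended to 64 bits).

```python
XLEN = 64  # assumed, as in the discussion above


def sext(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def zext(value: int, bits: int) -> int:
    """Keep only the low `bits` bits of value (unsigned)."""
    return value & ((1 << bits) - 1)


def addi_elwidth(rs1_val: int, imm12: int, width: int) -> int:
    """Assumed semantics: addi on one element of `width` bits.

    The 12-bit immediate is sign-extended (width > 12) or truncated
    (width < 12) to the element width before the add.
    """
    imm = zext(sext(imm12, 12), width)
    src = zext(rs1_val, width)
    return zext(src + imm, width)


def addw(rs1_val: int, rs2_val: int) -> int:
    """RV64I ADDW: 32-bit sum, sign-extended to XLEN bits."""
    return zext(sext(rs1_val + rs2_val, 32), XLEN)


print(hex(addi_elwidth(0x10, 0xFFF, 8)))    # imm = -1, truncated to 8 bits
print(hex(addi_elwidth(0x10, 0xFFF, 16)))   # imm = -1, extended to 16 bits
print(hex(addw(0x7FFFFFFF, 1)))             # overflows into the sign bit
```

 note how addw ignores any element width and always yields a
sign-extended 32-bit value, which is the property that would be lost if
the *W opcodes were repurposed to widen the elwidth space.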

still trying to get to grips with this :)

l.


