[Libre-soc-isa] [Bug 569] svp64 register predicates vs BE arrays of bits

bugzilla-daemon at libre-soc.org
Wed Feb 9 19:04:14 GMT 2022


https://bugs.libre-soc.org/show_bug.cgi?id=569

--- Comment #16 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #12)
> consider a case where processing of data requires LSB0 bit zero of
> each element to become, ultimately, part of a predicate mask.  the most
> logical thing to do is a Vectorised CMPI operation and the vector
> of CR Field results treated directly as a predicate mask.
> 
> if however BE is involved then at least one reversal instruction is required

no reversals are required, because you never touch an integer predicate. you go
from a vector of ints (not a predicate), through the cmpi, to a vector of cr
bits (not an integer predicate, so no bitreversal happens), and then to whatever
predicated instruction uses that cr vector as its predicate (again not an
integer predicate, so no bitreversal happens).
> 
> consider also the crweird instructions which transfer between integers and
> CR fields: these too would become damaged by bit and/or byte reversal.

those instructions would bitreverse the integers in BE mode


> when constructing big integer
> math libraries it is *not* possible to sequentially store an array of 64 bit 
> numbers representing the large number then follow up with a typecast to
> an array of 32-bit: you *have* to perform word-swapping on pairs of 32bit
> numbers first to get the sequence back.

actually, it's perfectly possible to have that property in BE: simply store the
bigint in BE byte order, most significant byte first.

e.g. a 256-bit bigint:

0xFEDCBA9876543210FEDCBA9876543210FEDCBA9876543210FEDCBA9876543210

is stored as an array of bytes:

[
0xFE, 0xDC, 0xBA, 0x98, 0x76, 0x54, 0x32, 0x10,
0xFE, 0xDC, 0xBA, 0x98, 0x76, 0x54, 0x32, 0x10,
0xFE, 0xDC, 0xBA, 0x98, 0x76, 0x54, 0x32, 0x10,
0xFE, 0xDC, 0xBA, 0x98, 0x76, 0x54, 0x32, 0x10,
]

which can be grouped in 32-bit chunks:

[
0xFEDCBA98, 0x76543210, 0xFEDCBA98, 0x76543210,
0xFEDCBA98, 0x76543210, 0xFEDCBA98, 0x76543210,
]
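
a minimal python sketch of the same point, using only the standard library: the
example value is written out most-significant-byte-first, then the same bytes
are read as BE 64-bit and BE 32-bit elements. the variable names are just for
illustration and are not part of any proposal.

```python
import struct

# the 256-bit example value from above
value = 0xFEDCBA9876543210FEDCBA9876543210FEDCBA9876543210FEDCBA9876543210

# store it big-endian: most significant byte at the lowest address
be_bytes = value.to_bytes(32, "big")

# reinterpret the very same bytes as BE 64-bit and BE 32-bit elements
words64 = struct.unpack(">4Q", be_bytes)
words32 = struct.unpack(">8I", be_bytes)

# splitting each 64-bit element into (high, low) halves gives the 32-bit
# sequence directly: no swapping of 32-bit pairs is needed
split64 = [half for w in words64 for half in (w >> 32, w & 0xFFFFFFFF)]
assert split64 == list(words32)

# the bytes still round-trip to the original bigint
assert int.from_bytes(be_bytes, "big") == value

print([hex(w) for w in words32])
```
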

> 
> given the interchangeability between predicates and data it is simply
> not safe or sane to attempt anything other than treating the regfile as
> a byte-addressable LE-ordered SRAM.

imho it is perfectly safe/sane to treat the regfile as a byte-addressable SRAM
whose byte order follows the memory BE/LE mode. the one addition is that the
whole SRAM is byteswapped when the cpu switches between BE and LE mode, to
preserve 64-bit reg values for backward compatibility; user applications can
completely ignore that switching. it requires slightly more hardware, but
greatly simplifies software.
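
a rough python model of that idea, as a sketch only: the RegFile class, its
method names and the 4-register size are all hypothetical, and "byteswapped" is
assumed here to mean reversing the bytes of each 64-bit register. it shows a
64-bit value surviving a mode switch while narrower elements follow the current
byte order.

```python
import struct

class RegFile:
    """hypothetical model: 64-bit registers backed by one byte array."""

    def __init__(self, nregs=4, big_endian=False):
        self.mem = bytearray(nregs * 8)
        self.big_endian = big_endian

    def _fmt(self, code):
        # struct byte-order prefix follows the current mode
        return (">" if self.big_endian else "<") + code

    def write64(self, reg, value):
        struct.pack_into(self._fmt("Q"), self.mem, reg * 8, value)

    def read64(self, reg):
        return struct.unpack_from(self._fmt("Q"), self.mem, reg * 8)[0]

    def read_elements(self, reg, code, count):
        # view the bytes starting at `reg` as `count` narrower elements
        fmt = self._fmt(f"{count}{code}")
        return list(struct.unpack_from(fmt, self.mem, reg * 8))

    def switch_endian(self):
        # assumption: reverse the bytes of every 64-bit register so that
        # read64() returns the same values after the switch
        for off in range(0, len(self.mem), 8):
            self.mem[off:off + 8] = self.mem[off:off + 8][::-1]
        self.big_endian = not self.big_endian

rf = RegFile()                      # starts in LE mode
rf.write64(0, 0x0123456789ABCDEF)
le_halves = rf.read_elements(0, "I", 2)

rf.switch_endian()                  # now BE mode
assert rf.read64(0) == 0x0123456789ABCDEF   # 64-bit value preserved
be_halves = rf.read_elements(0, "I", 2)

print([hex(x) for x in le_halves])  # low half first in LE mode
print([hex(x) for x in be_halves])  # high half first in BE mode
```
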
> 
> having such a dedicated property ensures that changing elwidths does not
> require such byteswapping instructions.

actually, in BE mode, changing elwidths would require separate byteswapping
instructions if we followed your regfile-is-only-LE plan. if we followed lxo's
and my regfile-endian-matches-memory-endian plan, the programmer would not need
to add any separate byteswapping instructions.
> 
> i appreciate that LLVM may have made some assumptions about SIMD, but tough.
> when we have the prerequisite USD 25 million to do a decent job of adding
> SVP64 to LLVM this can be addressed, and LLVM assumptions sorted out.
> it is good to be *aware* of the limitations, because there will be no
> surprises in budgeting to sort it out.
