[Libre-soc-dev] svp64 questions: various
oliva at gnu.org
Sat Dec 26 12:04:05 GMT 2020
Tagging as vectors: is that a property of the register, or of the insn?
Consider a vector-prefixed insn:
add rD, rA, rB
Can rA and rB be both used as vectors, or both as scalars, or one as
vector and the other as scalar?
The destination, if not vector, could imply identity, reduce, or
first-predicate-that-holds, so that appears to be a property of the insn
rather than of the register.
But if one of the source registers is [r]0, then I presume this zero
would be used as a scalar rather than as a vector, which would make this
a property of the register rather than of the insn.
However, I expected to find encodings (modes?) to tell whether each
operand is scalar or vector, but aside from reduce mode, I haven't found
any such thing. So, where does 'isvec' come from?
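To make the question concrete, here is a minimal Python sketch of one possible reading, assuming each operand carries its own scalar/vector flag. The function name sv_add, the three *_isvec parameters, and the early exit for a scalar destination are all hypothetical, chosen only for illustration; nothing here says where 'isvec' would actually be encoded.

```python
def sv_add(regs, rd, ra, rb, vl, rd_isvec, ra_isvec, rb_isvec):
    """Hypothetical element loop for a vector-prefixed `add rD, rA, rB`.

    Assumption: every operand has an independent isvec flag. A vector
    operand steps to the next register per element; a scalar operand
    keeps reading (or writing) the same register.
    """
    for i in range(vl):
        src_a = regs[ra + i] if ra_isvec else regs[ra]
        src_b = regs[rb + i] if rb_isvec else regs[rb]
        regs[rd + i if rd_isvec else rd] = src_a + src_b
        if not rd_isvec:
            # Assumed "identity" behaviour: a scalar destination ends the
            # loop after one element. Reduce or first-predicate-that-holds
            # would need different handling here.
            break
    return regs


# regs[n] == n initially, so results are easy to check by hand.
regs = list(range(16))
# rA scalar (r1), rB vector (r4..r6), rD vector (r8..r10)
sv_add(regs, rd=8, ra=1, rb=4, vl=3,
       rd_isvec=True, ra_isvec=False, rb_isvec=True)
```

With the flags as parameters, mixing a scalar rA with a vector rB is just one of the eight combinations, which is why per-operand encodings are what I would expect to find.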
I'm a little concerned about vectors that encompass special-purpose
registers, particularly r30 (sometimes used as PIC base register) and
r31 (frame pointer). These inherited register assignments seem to make
trouble for us, but trying to avoid them is probably neither reasonable
nor worth the effort. I haven't checked whether they're mandated by
ABIs or are just GCC conventions. Frame pointers may be specified as an
unreliable means to build backtraces (debug info provides a more
reliable means for that, one that doesn't require fixed registers), but
PIC base registers may have to be set up for calls to/from dynamic
libraries to work, since the procedure linkage table may have to use
them.
Given that SVP64 expands the register files to 128 or, in the future,
256 registers, how does this fit in with the goal set for our
pre-decoder to issue exclusively pre-existing ppc64 insns? It's not
like it would be able to address so many registers in insn fields that
can only address 32 registers. Have we given up that goal? Or was that
for compression only?
This also becomes a concern if the register file wraps around from r127
(or r255) to r0, since there are various dedicated low-numbered
registers, including r0 for zero and r1 for the stack pointer. It's not
clear to me that it does wrap around, though. As I read about element
width overrides, there's no mention of it and the complications it could
bring to the sub-element indexing. OTOH, not wrapping around seems to
make the higher-numbered registers far less likely to be used.
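The two possibilities can be sketched as follows, assuming a 128-entry register file and a vector that simply occupies consecutive register numbers. NUM_REGS, element_regs and the wrap flag are hypothetical names for illustration, and which behaviour SVP64 actually specifies is exactly the open question.

```python
NUM_REGS = 128  # assumed size of the extended register file


def element_regs(base, vl, wrap):
    """Register numbers touched by a vl-element vector starting at base."""
    touched = []
    for i in range(vl):
        n = base + i
        if n >= NUM_REGS:
            if not wrap:
                raise IndexError("vector runs past r%d" % (NUM_REGS - 1))
            n %= NUM_REGS  # wrap-around: r127 is followed by r0
        touched.append(n)
    return touched


# A 4-element vector starting at r126, under the wrap-around assumption.
wrapped = element_regs(126, 4, wrap=True)
clobbers_sp = 1 in wrapped  # r1 is the stack pointer
```

Under the wrapping assumption, a vector near the top of the file lands on r0 and r1; without it, the same vector is simply out of range.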
It seems to me that, when using twin predication, one of the predicate
vectors may run out before the other, and it would be useful to be able
to tell how far they got.
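A minimal sketch of what I mean, assuming twin predication behaves like a copy in which source and destination each skip their own masked-out elements. The function twin_pred_copy and its return values are hypothetical, not taken from the spec.

```python
def twin_pred_copy(src, dst, src_mask, dst_mask, vl):
    """Copy elements of src selected by src_mask into the slots of dst
    selected by dst_mask, in order.

    Returns (i, j, copied): the source and destination indices at which
    the loop stopped, and the number of elements transferred.
    """
    i = j = copied = 0
    while True:
        # Skip elements whose predicate bit is clear (bit n = element n).
        while i < vl and not (src_mask >> i) & 1:
            i += 1
        while j < vl and not (dst_mask >> j) & 1:
            j += 1
        if i >= vl or j >= vl:
            break
        dst[j] = src[i]
        i += 1
        j += 1
        copied += 1
    return i, j, copied


src = [10, 11, 12, 13]
dst = [0, 0, 0, 0]
# All four sources are enabled, but only destination slots 0 and 2 are.
stopped = twin_pred_copy(src, dst, src_mask=0b1111, dst_mask=0b0101, vl=4)
```

Here the destination predicate runs out first: the loop stops with the source index at 2, leaving two enabled source elements unconsumed, and that stopping point is the information I'd like software to be able to retrieve.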
I saw some email, but I don't see much about extending the CR register
file, or about prefix bits to reference the extended registers. Anyway, there
are plenty of opcodes that set specific CRs as side effects, and it's
not clear whether those CRs are treated as scalar or vector registers.
Assuming CR vectors are indeed available, it seems to me that it would
be useful to be able to "compress" CR vectors into predicate registers,
i.e., selecting the relevant compare result bit from each CR and placing
it in the corresponding bit in a scalar register, to eventually be used
as a predicate. There doesn't seem to be any way to do this, is there?
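The operation I have in mind can be sketched as below. The 4-bit CR field layout (LT, GT, EQ, SO, with LT as the most significant bit) is the standard Power one; the function name crs_to_predicate and the choice of putting element 0 in the least significant predicate bit are assumptions for illustration.

```python
# Bits of a 4-bit CR field, with LT as the most significant bit.
LT, GT, EQ, SO = 0b1000, 0b0100, 0b0010, 0b0001


def crs_to_predicate(cr_fields, bit):
    """Pack one selected bit from each CR field into an integer mask.

    Assumption: element i of the CR vector maps to bit i of the result.
    """
    pred = 0
    for i, cr in enumerate(cr_fields):
        if cr & bit:
            pred |= 1 << i
    return pred


# Four compare results; select the elements that compared equal.
crs = [EQ, LT, EQ | SO, GT]
pred = crs_to_predicate(crs, EQ)
```

The result is an ordinary scalar value, so it could be combined with other masks using plain bitwise opcodes before being used as a predicate.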
If there really isn't, here's an idea: have a mode that modifies the predicate register,
zeroing those bits that, in pred-result mode, either get the store
canceled (condition not met), or those that perform the store (operation
already performed on this element). With this, cr logical ops can be
used to transfer select bits in CR vectors to predicate registers, that
can then be further operated on with bitwise opcodes.
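As a rough model of this proposal, assuming pred-result mode tests each enabled element's result and cancels the store when the test fails: the function pred_result_update, the clear_on parameter, and the test callback are all hypothetical names, and the two variants correspond to the two alternatives described above.

```python
def pred_result_update(pred, results, test, clear_on):
    """Return the predicate as the proposed mode would leave it.

    clear_on="cancelled": zero bits whose store was cancelled
                          (condition not met).
    clear_on="stored":    zero bits whose store was performed.
    Elements already masked out by pred are left untouched.
    """
    new_pred = pred
    for i, result in enumerate(results):
        if not (pred >> i) & 1:
            continue
        met = test(result)
        if (clear_on == "cancelled" and not met) or \
           (clear_on == "stored" and met):
            new_pred &= ~(1 << i)
    return new_pred


results = [0, 5, 0, 7]


def nonzero(r):
    return r != 0


kept = pred_result_update(0b1111, results, nonzero, "cancelled")
todo = pred_result_update(0b1111, results, nonzero, "stored")
```

The first variant leaves a mask of the elements that passed the test; the second leaves a mask of the elements still to be dealt with.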
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar