[Libre-soc-dev] svp64
Jacob Lifshay
programmerjake at gmail.com
Tue Dec 15 04:50:51 GMT 2020
On Mon, Dec 14, 2020, 20:02 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:
> On 12/15/20, Jacob Lifshay <programmerjake at gmail.com> wrote:
> > I updated the WIP svp64 spec to add an alternative CR register naming
> > scheme that is more consistent with the integer/fp register naming
> > scheme. Vectorized instructions would count CRs in SVCR* order, the
> > initial CR register is TBD but could be CR0 or still CR6.
>
> starting from just above the callee saved (CR5) would, using vertical
> increments, mean that scalar CR0 and CR1 were not overwritten for at
> least VL=25.
>
> > I also added
> > a spec for how the SV CRs would map to SPR fields.
>
> ok yes this is needed. the mapping of naming, how the SPRs relate, is
> however still not clear. SVCR_NN_MM i am completely unable to tell
> what the relationship is to a linear numbering of CRs from 0 to 63.
>
I thought it was obvious: for all SV[F|C]R<N>_<M> registers, N is the
upper bits in decimal and M is the lower bits in binary, so SVR5_01 is
SV int register (5 << 2) + 0b01 = 21, and SVCR6_011 is SV cond register
(6 << 3) + 0b011 = 51. Vectorization just increments the SV register numbers
(adjusted for elwidth and subvl, of course), so a vectorized 32-bit add:
add SVR3_01, SVR6_10, SVR10_00, elwidth=32, subvl=1, mask=lt
does the following:
const size_t start_cr = (6 << 3) + 0b000; // starting at SVCR6_000
// pretend for the moment that type-punning actually works in C/C++
uint32_t *rt = (uint32_t *)&regs[(3 << 2) + 0b01]; // SVR3_01
uint32_t *ra = (uint32_t *)&regs[(6 << 2) + 0b10]; // SVR6_10
uint32_t *rb = (uint32_t *)&regs[(10 << 2) + 0b00]; // SVR10_00
for(size_t i = 0; i < VL; i++) {
    if(CRs[(start_cr + i) % 64].lt) {
        rt[i] = ra[i] + rb[i];
    }
}
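
The name-to-number rule used above can be sketched as two small helpers. This is only an illustration: the names svr_index and svcr_index are hypothetical, and the range checks assume 128 int registers (32 x 4) and 64 CR fields (8 x 8), as implied by the examples in this thread rather than by a finalised spec.

```cpp
#include <cassert>
#include <cstddef>

// Assumed widths of the binary suffix M, taken from the examples above.
constexpr std::size_t INT_M_BITS = 2; // SVR<N>_<MM>
constexpr std::size_t CR_M_BITS = 3;  // SVCR<N>_<MMM>

// Linear number of SV int register SVR<n>_<m> (hypothetical helper).
inline std::size_t svr_index(std::size_t n, std::size_t m) {
    assert(n < 32 && m < (1u << INT_M_BITS));
    return (n << INT_M_BITS) | m;
}

// Linear number of SV cond register SVCR<n>_<m> (hypothetical helper).
inline std::size_t svcr_index(std::size_t n, std::size_t m) {
    assert(n < 8 && m < (1u << CR_M_BITS));
    return (n << CR_M_BITS) | m;
}
```

With these, the three operands of the add above come out as svr_index(3, 1), svr_index(6, 2) and svr_index(10, 0), and the predicate starts at svcr_index(6, 0).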
>
> i find i am looking at a confusing table of 64 entries that tell me
> nothing about how to implement them.
>
> thus in the predication table, without that relationship clearly
> expressed, CR[i] is meaningless.
>
CR[i] is the notation used by the OpenPower spec to refer to CR field #i,
so FP instructions with Rc=1 write to CR[1] aka SVCR1_000.
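
As a rough sketch of that aliasing, assuming every scalar CR field i lands on SVCR<i>_000 under the alternative naming scheme (the thread only states this explicitly for CR[1]); the helper name is hypothetical:

```cpp
#include <cassert>
#include <cstddef>

// Linear SV CR number that scalar CR field i aliases, assuming the
// 3-bit binary suffix is all zeros for the scalar fields CR0..CR7.
inline std::size_t scalar_cr_to_sv_index(std::size_t i) {
    assert(i < 8); // the scalar ISA has 8 CR fields
    return i << 3;
}
```

Under that assumption, an FP instruction with Rc=1 writes the CR field at linear number scalar_cr_to_sv_index(1), i.e. 8.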
>
>
> > Also, I responded to some of the things on the svp64 discussion page.
>
> and, whoops, reverted the table which i'd deliberately split, to
> indicate that the MSB should be removed, as described in the TODO that
> i had added above it.
>
> i do not think it is a good idea to allow mixing of INT and CR
> predication for Twin Predication.
>
I think it's a good idea to allow mixing them; however, supporting a reduced
set of predicates to save instruction encoding bits is a valid reason to
define a smaller predicate field just for twin predication.
>
> thus there only need be one mode select bit and 1x 3 bit fields or 2x
> 3 bit fields, the mode bit applying to both.
>
> still TODO, an algorithm describing how the names are derived.
> INT_NN_MM needs an explicit formula showing how the name relates to
> the sequential regs GPR[0..127] and the same for FPR and the 4 64 bit
> CRs.
>
The explicit formula is bit concatenation -- N is the most significant
bits, M is the least significant bits, with the intention that future
regfile expansions will tack more bits on the LSB end of M, hence why the
mnemonics use binary instead of decimal for M.
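
Going the other way, a name can be derived from a linear register number by splitting off the low bits. A minimal sketch, where sv_reg_name is a hypothetical helper and the suffix width (2 bits for int, and presumably fp, 3 bits for CR) is an assumption taken from the examples in this thread:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Build "<prefix><N>_<M>" from a linear register number: N is the upper
// bits printed in decimal, M is the low m_bits bits printed in binary.
inline std::string sv_reg_name(const std::string &prefix, std::size_t index,
                               std::size_t m_bits) {
    assert(m_bits > 0 && m_bits < 8);
    std::string name = prefix + std::to_string(index >> m_bits) + "_";
    // Emit the M bits most-significant first.
    for (std::size_t bit = m_bits; bit-- > 0;) {
        name += ((index >> bit) & 1) ? '1' : '0';
    }
    return name;
}
```

For example, sv_reg_name("SVR", 21, 2) gives "SVR5_01" and sv_reg_name("SVCR", 51, 3) gives "SVCR6_011". Passing m_bits as a parameter leaves room for a wider M if a later regfile expansion adds bits at the LSB end.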
Jacob