[Libre-soc-bugs] [Bug 558] gcc SV intrinsics concept

Wed Jan 13 19:04:10 GMT 2021

https://bugs.libre-soc.org/show_bug.cgi?id=558

--- Comment #53 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #52)
> sigh.  i updated the svp64 appendix to describe the CRs: it's a pig.  also
> rather than use an underscore it occurred to me that a decimal place "does
> the job" in an intuitive way.  the problem is that in a Vector of CRs some
> are *not accessible* as Scalars, a predicated mv is needed.  but if we want
> something that's doable for gcc without needing a year+ of work, this is how
> it goes.

Congrats! Now, if you also apply something similar to that to int/fp registers,
then you will have implemented basically what I proposed in bug #553.

One difference is that int/fp registers (but not CR fields) are ordered as
follows when counting in vector element order:
r0, r32, r64, r96,
r1, r33, r65, r97,
r2, r34, r66, r98 -- wrapping around after 4 instead of 8 registers.

for r<N> the element-order index =
((N & 0b11111) << 2) | ((N & 0b1100000) >> 5)
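
As a quick sanity check, here is a minimal Python sketch of the int/fp formula. The function name `int_fp_element_index` is made up for illustration and is not part of any spec or codebase; sorting all 128 register numbers by the computed index should reproduce the listing above.

```python
def int_fp_element_index(n: int) -> int:
    """Element-order index of int/fp register r<n> (hypothetical helper)."""
    if not 0 <= n < 128:
        raise ValueError("register number must be in 0..127")
    # The low 5 bits of N become the high bits of the index; the top 2 bits
    # of N become the low bits, so the order wraps around after 4 registers.
    return ((n & 0b11111) << 2) | ((n & 0b1100000) >> 5)


# Register numbers listed in vector element order.
int_fp_order = sorted(range(128), key=int_fp_element_index)
print(int_fp_order[:12])  # [0, 32, 64, 96, 1, 33, 65, 97, 2, 34, 66, 98]
```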

CR fields go like (matching what you described):
CR0, CR8, CR16, CR24, CR32, CR40, CR48, CR56,
CR1, CR9, CR17, CR25, CR33, CR41, CR49, CR57 -- wrapping around after 8
registers

for CR<N> the element-order index = ((N & 0b111) << 3) | ((N & 0b111000) >> 3)
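
The same kind of check for CR fields, again as a sketch with a made-up function name (`cr_element_index`). Because the formula just swaps the two 3-bit halves of a 6-bit number, applying it twice gives back the original CR number.

```python
def cr_element_index(n: int) -> int:
    """Element-order index of CR field CR<n> (hypothetical helper)."""
    if not 0 <= n < 64:
        raise ValueError("CR field number must be in 0..63")
    # Swap the low 3 bits and the high 3 bits, so the order wraps around
    # after 8 registers.
    return ((n & 0b111) << 3) | ((n & 0b111000) >> 3)


# CR field numbers listed in vector element order.
cr_order = sorted(range(64), key=cr_element_index)
print(cr_order[:16])
# [0, 8, 16, 24, 32, 40, 48, 56, 1, 9, 17, 25, 33, 41, 49, 57]
```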

Another difference is that the special case for svp64 int/fp/cr extra2/extra3
decoding is removed. Instead, they always decode the way they currently do
for scalars:
int/fp extra2:
| R\*\_EXTRA2 | Mode   | Range     | MSB down to LSB |
|-------------|--------|-----------|-----------------|
| 00          | Scalar | `r0-r31`  | `0b00 RA`       |
| 01          | Scalar | `r32-r63` | `0b01 RA`       |
| 10          | Vector | `r0-r31`  | `0b00 RA`       |
| 11          | Vector | `r32-r63` | `0b01 RA`       |
int/fp extra3:
| R\*\_EXTRA3 | Mode   | Range      | MSB down to LSB |
|-------------|--------|------------|-----------------|
| 000         | Scalar | `r0-r31`   | `0b00 RA`       |
| 001         | Scalar | `r32-r63`  | `0b01 RA`       |
| 010         | Scalar | `r64-r95`  | `0b10 RA`       |
| 011         | Scalar | `r96-r127` | `0b11 RA`       |
| 100         | Vector | `r0-r31`   | `0b00 RA`       |
| 101         | Vector | `r32-r63`  | `0b01 RA`       |
| 110         | Vector | `r64-r95`  | `0b10 RA`       |
| 111         | Vector | `r96-r127` | `0b11 RA`       |
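
Read as code, the two int/fp tables say: the top bit of the EXTRA field selects scalar/vector, and the remaining bits are prepended to the 5-bit RA field. A rough Python sketch of that reading follows; `decode_int_fp_extra` and its parameter names are invented for illustration, not taken from any existing decoder.

```python
from typing import Tuple


def decode_int_fp_extra(extra: int, ra: int, extra_bits: int) -> Tuple[bool, int]:
    """Return (is_vector, register number) per the tables (hypothetical helper)."""
    if extra_bits not in (2, 3):
        raise ValueError("extra_bits must be 2 (EXTRA2) or 3 (EXTRA3)")
    if not 0 <= extra < (1 << extra_bits):
        raise ValueError("extra out of range")
    if not 0 <= ra < 32:
        raise ValueError("RA is a 5-bit field")
    # The MSB of the EXTRA field is the scalar/vector flag.
    is_vector = bool(extra >> (extra_bits - 1))
    # The remaining low bits sit above the 5-bit RA field.
    high = extra & ((1 << (extra_bits - 1)) - 1)
    return is_vector, (high << 5) | ra


print(decode_int_fp_extra(0b11, 3, extra_bits=2))   # (True, 35)
print(decode_int_fp_extra(0b110, 3, extra_bits=3))  # (True, 67)
```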
cr extra2:
| R\*\_EXTRA2 | Mode   | 7..5  | 4..2    | 1..0    |
|-------------|--------|-------|---------|---------|
| 00          | Scalar | 0b000 | BA[4:2] | BA[1:0] |
| 01          | Scalar | 0b001 | BA[4:2] | BA[1:0] |
| 10          | Vector | 0b000 | BA[4:2] | BA[1:0] |
| 11          | Vector | 0b001 | BA[4:2] | BA[1:0] |
cr extra3:
| R\*\_EXTRA3 | Mode   | 7..5  | 4..2    | 1..0    |
|-------------|--------|-------|---------|---------|
| 000         | Scalar | 0b000 | BA[4:2] | BA[1:0] |
| 001         | Scalar | 0b001 | BA[4:2] | BA[1:0] |
| 010         | Scalar | 0b010 | BA[4:2] | BA[1:0] |
| 011         | Scalar | 0b011 | BA[4:2] | BA[1:0] |
| 100         | Vector | 0b000 | BA[4:2] | BA[1:0] |
| 101         | Vector | 0b001 | BA[4:2] | BA[1:0] |
| 110         | Vector | 0b010 | BA[4:2] | BA[1:0] |
| 111         | Vector | 0b011 | BA[4:2] | BA[1:0] |
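
The CR tables follow the same pattern, producing an 8-bit value with the EXTRA bits (minus the scalar/vector flag) in bits 7..5 and the 5-bit BA field unchanged in bits 4..0. A sketch of that layout in Python, where `decode_cr_extra` is a made-up name and the packing into a single integer is my own assumption about how one would represent the result:

```python
from typing import Tuple


def decode_cr_extra(extra: int, ba: int, extra_bits: int) -> Tuple[bool, int]:
    """Return (is_vector, 8-bit value laid out as in the tables).

    Hypothetical helper for illustration only.
    """
    if extra_bits not in (2, 3):
        raise ValueError("extra_bits must be 2 (EXTRA2) or 3 (EXTRA3)")
    if not 0 <= extra < (1 << extra_bits):
        raise ValueError("extra out of range")
    if not 0 <= ba < 32:
        raise ValueError("BA is a 5-bit field")
    # The MSB of the EXTRA field is the scalar/vector flag.
    is_vector = bool(extra >> (extra_bits - 1))
    # The remaining bits are zero-extended into bits 7..5.
    high = extra & ((1 << (extra_bits - 1)) - 1)
    # BA[4:2] stays in bits 4..2 and BA[1:0] stays in bits 1..0.
    return is_vector, (high << 5) | ba


print(decode_cr_extra(0b11, 0b10110, extra_bits=2))   # (True, 54)
print(decode_cr_extra(0b110, 0b10110, extra_bits=3))  # (True, 86)
```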

For LE CPU byte order:
VSX/scalar-float registers 0-31 map their lower 64 bits to f0-31 and their
upper 64 bits to f32-63.
VSX registers 32-63 (AltiVec regs 0-31) map their lower 64 bits to f64-95 and
their upper 64 bits to f96-127.
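
The LE mapping above can be written as a small lookup. This is only a sketch of the mapping as stated here; the function name `vsx_to_fp_le` is invented for illustration.

```python
from typing import Tuple


def vsx_to_fp_le(vsr: int) -> Tuple[int, int]:
    """Return (f-register for the lower 64 bits, f-register for the upper 64 bits).

    Hypothetical helper covering only the LE case described above.
    """
    if not 0 <= vsr < 64:
        raise ValueError("VSX register number must be in 0..63")
    if vsr < 32:
        # VSX 0-31: lower half -> f0-31, upper half -> f32-63
        return vsr, vsr + 32
    # VSX 32-63: lower half -> f64-95, upper half -> f96-127
    return vsr + 32, vsr + 64


print(vsx_to_fp_le(5))   # (5, 37)
print(vsx_to_fp_le(32))  # (64, 96)
```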
