[Libre-soc-bugs] [Bug 238] POWER Compressed Formal Standard writeup

bugzilla-daemon at libre-soc.org
Tue Nov 17 21:52:46 GMT 2020


--- Comment #25 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
oooo i just had an awesome idea.  one of the problems we have with trying to
use an existing scalar ISA for GPU is: 32 bit opcodes just don't fit.  zero
space for swizzles etc.

i just realised that after compression, SV-C48 or better SV-C64 leaves enough
space to add 12 bits for full swizzle selection.

how did that work again? a mask is needed to say which of the 4 elements of a
vec4 are to be selected (0110 says YZ to be used out of XYZW) then 2-2-2-2 are
the 2-bit indices to specify the order?
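
a rough sketch of that mask-plus-indices idea in python, just to pin down the semantics being asked about. the name apply_swizzle, the choice that the leftmost mask bit is X (so 0110 reads as YZ), and returning None for masked-out lanes are all assumptions for illustration, not anything decided for SV:

```python
def apply_swizzle(vec, mask, indices):
    """Apply a vec4 swizzle (illustrative sketch, not an SV definition).

    vec     -- list of 4 source elements, in X, Y, Z, W order
    mask    -- 4-bit int, leftmost bit = X (assumption: 0b0110 selects Y and Z)
    indices -- 4 two-bit selectors; enabled lane i reads vec[indices[i]]
    """
    out = []
    for lane in range(4):
        # lane 0 is X and sits in the most significant of the 4 mask bits
        enabled = (mask >> (3 - lane)) & 1
        # masked-out lanes are shown as None (assumption: left untouched)
        out.append(vec[indices[lane]] if enabled else None)
    return out
```

for example apply_swizzle([10, 20, 30, 40], 0b0110, [0, 2, 1, 3]) only touches the Y and Z lanes and swaps their sources, giving [None, 30, 20, None].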

that's 8 bits per reg though, which means 4+8+8 = 20 bits needed to specify
mask plus src1-shuffle plus src2-shuffle.

however 12 bits (4 mask + 8 shuffle) would at least be enough for a mv.
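
to make the bit-counting concrete, here is a small python sketch that packs the fields. the function name pack_swizzle and the field order (mask in the top bits, then src1 indices, then src2 indices) are purely hypothetical, nothing here reflects an actual SV-C48 or SV-C64 layout:

```python
def pack_swizzle(mask, src1, src2=None):
    """Pack a 4-bit mask plus one or two 8-bit shuffles (hypothetical layout).

    Returns (word, nbits): 12 bits with one source, 20 bits with two.
    """
    def pack_indices(indices):
        # four 2-bit lane selectors, first selector in the highest bits
        if len(indices) != 4:
            raise ValueError("need exactly 4 lane indices")
        word = 0
        for idx in indices:
            if not 0 <= idx <= 3:
                raise ValueError("lane index must fit in 2 bits")
            word = (word << 2) | idx
        return word

    word = mask & 0xF                    # 4 bits of mask
    word = (word << 8) | pack_indices(src1)
    nbits = 4 + 8                        # single-source case, e.g. a mv
    if src2 is not None:
        word = (word << 8) | pack_indices(src2)
        nbits += 8                       # two-source case: 4+8+8
    return word, nbits
```

with one source, pack_swizzle(0b0110, [0, 1, 2, 3]) gives (0x61B, 12); adding src2=[3, 2, 1, 0] gives (0x61BE4, 20).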

or, is it normal to apply the same swizzle to src1 and src2?
