[Libre-soc-bugs] [Bug 238] POWER Compressed Formal Standard writeup

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Wed Nov 18 11:02:26 GMT 2020


https://bugs.libre-soc.org/show_bug.cgi?id=238

--- Comment #31 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #28)
> (In reply to cand from comment #27)
> > Why would you need 8 bits to specify the order of four elements though? 4! =
> > 24, which is representable in 5 bits. This does need a lookup table in hw,
> > but surely those are fast and cheap at that size.
> 
> because it's not just specifying order, you can have swizzles that duplicate
> some input elements and skip other elements.

mask:   0bXYZW
select: vec4[0]=0bXX vec4[1]=0bYY vec4[2]=0bZZ vec4[3]=0bWW

total of 12 bits (4 mask + 4x2 select)... *per vec4* operand, i.e. for
each of src1/src2/dest (unfortunately)
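
To make the 12-bit count concrete, here is a minimal sketch of packing one vec4 swizzle into a word and applying it. The bit layout (mask in bits 11..8, four 2-bit selectors in bits 7..0) and the names encode_swizzle / apply_swizzle are assumptions made for illustration only; they are not taken from SV or from any existing ISA.

```python
# Hypothetical 12-bit layout (an assumption, not from the thread):
#   bits 11..8 : mask, one bit per destination lane (lane 0 in bit 8)
#   bits  7..0 : four 2-bit selectors (lane 0 in bits 1..0)

def encode_swizzle(mask, select):
    """Pack a 4-entry mask and four 2-bit selectors into a 12-bit word."""
    assert len(mask) == 4 and len(select) == 4
    word = 0
    for lane in range(4):
        assert 0 <= select[lane] <= 3
        word |= select[lane] << (2 * lane)
        word |= (1 if mask[lane] else 0) << (8 + lane)
    return word

def apply_swizzle(word, src, dest):
    """Write src[selector] into each dest lane whose mask bit is set."""
    out = list(dest)
    for lane in range(4):
        if (word >> (8 + lane)) & 1:
            out[lane] = src[(word >> (2 * lane)) & 0b11]
    return out

# all four lanes enabled, every lane selects element 0 (X)
all_x = encode_swizzle([1, 1, 1, 1], [0, 0, 0, 0])
broadcast = apply_swizzle(all_x, ["X", "Y", "Z", "W"], [None] * 4)

# lanes 0 and 2 enabled only; masked-out lanes keep the old dest value
partial = encode_swizzle([1, 0, 1, 0], [3, 2, 1, 0])
merged = apply_swizzle(partial, ["X", "Y", "Z", "W"], ["a", "b", "c", "d"])
```

Each of src1, src2 and dest would carry its own such word, which is where the cost multiplies.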

the mask is actually predicate bits, applied to sub-elements (SUBVL).
hypothetically we could add this capability to SV, thus only needing
8 bits for selecting from the vec4, but normally it's
encoded into a constant/immediate/field in the instruction.

if only a permutation were needed the encoding could be done in fewer bits,
but it is reasonable to have e.g. a vec4 select of src1=all-X (4 copies of X),
all applied to a vec4 src2=XYZW
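
The bit counts discussed in this thread can be checked with a short sketch: pure permutations of four elements fit in 5 bits, while general selects (duplicates allowed) need 8. The helper names and the sample values are hypothetical, chosen only to illustrate the all-X case.

```python
from itertools import permutations, product

def bits_needed(n):
    """Smallest number of bits that can distinguish n different values."""
    return (n - 1).bit_length()

# pure reorderings: every source element used exactly once
perms = list(permutations(range(4)))
# general swizzles: each of the 4 lanes picks any of the 4 elements
swizzles = list(product(range(4), repeat=4))

perm_bits = bits_needed(len(perms))        # 4! = 24 cases
swizzle_bits = bits_needed(len(swizzles))  # 4**4 = 256 cases

def swizzle(src, select):
    """Build a vec4 by picking src[select[lane]] for each lane."""
    return [src[s] for s in select]

# all-X select: not a permutation, so a 5-bit encoding cannot express it
src1 = [2, 3, 5, 7]
src2 = [10, 20, 30, 40]
all_x = swizzle(src1, (0, 0, 0, 0))
scaled = [a * b for a, b in zip(all_x, src2)]
```

The broadcast select (0, 0, 0, 0) is one of the 256 general swizzles but is absent from the 24 permutations, which is why the 2-bits-per-lane form is used.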

-- 
You are receiving this mail because:
You are on the CC list for the bug.