[Libre-soc-bugs] [Bug 238] POWER Compressed Formal Standard writeup
bugzilla-daemon at libre-soc.org
Wed Nov 18 11:02:26 GMT 2020
https://bugs.libre-soc.org/show_bug.cgi?id=238
--- Comment #31 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #28)
> (In reply to cand from comment #27)
> > Why would you need 8 bits to specify the order of four elements though? 4! =
> > 24, which is representable in 5 bits. This does need a lookup table in hw,
> > but surely those are fast and cheap at that size.
>
> because it's not just specifying order, you can have swizzles that duplicate
> some input elements and skip other elements.
mask:   0bXYZW (4 bits, one per element)
select: vec4[0] 0bXX, vec4[1] 0bYY, vec4[2] 0bZZ, vec4[3] 0bWW (4 x 2 = 8 bits)

total of 12 bits... *per vec4* src1/src2/dest (unfortunately)

the mask is actually predicate bits, applied to sub-elements (SUBVL).
hypothetically we could add this capability to SV, leaving only the
8 select bits to encode for the vec4, but normally it's
encoded into a constant/immediate/field in the instruction.

if only a permutation were needed the encoding could be done in fewer bits,
but it is reasonable to have e.g. a vec4 select src1=all-X (4 copies of X)
applied to a vec4 src2=XYZW.
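
to make the bit counting concrete, here is a minimal sketch of the 12-bit
scheme described above. the names (pack_swizzle, apply_swizzle), the bit
order (selects in bits 0..7, mask in bits 8..11) and the rule that a
masked-out element keeps its old destination value are all assumptions made
for illustration, not the SV encoding.

```python
from itertools import product

def pack_swizzle(select, mask):
    """Pack 4 two-bit lane selects and 4 mask bits into one 12-bit value.

    Hypothetical layout: select for element i sits in bits 2i..2i+1,
    mask bit for element i sits in bit 8+i.
    """
    bits = 0
    for i, (sel, enabled) in enumerate(zip(select, mask)):
        bits |= (sel & 0b11) << (2 * i)
        bits |= int(bool(enabled)) << (8 + i)
    return bits

def apply_swizzle(bits, src, dest):
    """Return dest with each enabled element replaced by the selected src lane.

    Assumption: a masked-out element is left unchanged.
    """
    out = list(dest)
    for i in range(4):
        if (bits >> (8 + i)) & 1:
            out[i] = src[(bits >> (2 * i)) & 0b11]
    return out

# "all-X": every element selects lane 0, so the source X is copied 4 times.
all_x = pack_swizzle([0, 0, 0, 0], [True] * 4)
splat = apply_swizzle(all_x, ["x", "y", "z", "w"], [None] * 4)

# 2 select bits per element allow 4**4 patterns, duplicates included;
# only 4! of them are pure permutations (fits in 5 bits via a lookup table).
patterns = list(product(range(4), repeat=4))
n_swizzles = len(patterns)                                      # 256
n_permutations = sum(1 for p in patterns if len(set(p)) == 4)   # 24
```

the all-X case is one of the 232 select patterns that are not permutations,
which is why a 5-bit permutation index cannot express it.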