[Libre-soc-isa] [Bug 213] SimpleV Standard writeup needed

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Wed Nov 18 15:20:36 GMT 2020


https://bugs.libre-soc.org/show_bug.cgi?id=213

Luke Kenneth Casson Leighton <lkcl at lkcl.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cand at gmx.com

--- Comment #95 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(moving to bug #213)

(In reply to cand from comment #36)
> I guess we're talking past each other. I'm saying using a lookup table lets
> you save bits. Instead of 4+8 per vec4, you have 8.

let's work out why.

there are two separate and distinct things needed here (both quite normally
provided by swizzle)

1) the ability to select 1, 2, 3 or 4 parts of a vec4 to perform the
   vector-computation on.  examples:

   # select 2 elements from a XYZW vec4
   fadd v1.XY, v2.WZ, v3.WZ

   # select 3 elements from an ARGB vec4
   fmul v1.RGB, v1.RGB, v3.AAA

   this latter would be used, for example, to linearly multiply the RGB
   by the transparency (the A of ARGB)

2) the ability to select any part of a vec4 to place it into any other position
in a vec4.
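
(a rough sketch of item 2, not taken from any spec: it assumes the usual
layout of four 2-bit source indices packed into 8 bits, with output position
0 in the low bits.  the names encode_swizzle and apply_swizzle are
hypothetical, purely for illustration.)

```python
LANES = "XYZW"  # lane names; index 0..3 fits in 2 bits


def encode_swizzle(names):
    """Pack a 4-letter swizzle such as "WZYX" into 8 bits.

    Assumed layout: 2 bits per output position, position 0 in the low bits.
    """
    if len(names) != 4:
        raise ValueError("a vec4 swizzle needs exactly 4 lane names")
    code = 0
    for pos, name in enumerate(names):
        code |= LANES.index(name) << (2 * pos)
    return code


def apply_swizzle(code, vec):
    """Output position pos takes the source lane named by its 2-bit field."""
    return [vec[(code >> (2 * pos)) & 0b11] for pos in range(4)]


v = [10.0, 20.0, 30.0, 40.0]          # X, Y, Z, W
code = encode_swizzle("WZYX")          # reverse the vector
print(f"{code:08b}", apply_swizzle(code, v))
# prints: 00011011 [40.0, 30.0, 20.0, 10.0]
```

note that all 8 bits are spent on *where each element comes from*: nothing in
this encoding says "skip this element".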


question:

   how, if there are only 8 bits, is it possible to specify that some of the
vec4 elements are to be ignored?

you can't... unless there is a predicate mask.

there are 2 possible ways that can be encoded, due to a quirk of swizzle:

1) a 4 bit mask.  the elements in the vec4 with 0 set are ignored, just as with
standard predication.  in fact, it is predication.

2) use an 8 bit swizzle to move things into the right order, even if they are
not to be used... and then set SUBVL to 2 or 3, ignoring the upper elements.

this latter is wasteful.  the mv takes place, taking up register port
bandwidth, but the values are thrown out? doesn't seem wise to me.
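
(to make the contrast concrete, here is a small model of the two options.
it is only a sketch: masked_select and swizzle_then_truncate are hypothetical
helper names, and counting "wasted moves" is just a stand-in for the register
port bandwidth concern, not a measurement of any real hardware.)

```python
def masked_select(vec, mask):
    """Option 1: a 4-bit predicate mask; lanes whose bit is 0 are skipped.

    Assumed convention: lane 0 corresponds to bit 0 of the mask.
    """
    return [x for lane, x in enumerate(vec) if (mask >> lane) & 1]


def swizzle_then_truncate(vec, indices, subvl):
    """Option 2: swizzle all four lanes, then keep only the first subvl.

    Returns the kept elements and the number of lane moves whose
    results were discarded.
    """
    moved = [vec[i] for i in indices]      # all four moves are performed
    return moved[:subvl], len(moved) - subvl


v = [1.0, 2.0, 3.0, 4.0]                   # X, Y, Z, W

# both select X and W
print(masked_select(v, 0b1001))                  # [1.0, 4.0]
print(swizzle_then_truncate(v, [0, 3, 1, 2], 2)) # ([1.0, 4.0], 2)
```

both give the same two elements, but in option 2 half of the moves were
performed only for their results to be dropped.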

BUT

with SVPrefix containing the SUBVL we have a trick: the SUBVL applies to that
operation, right there, right then.

therefore we *do* have all the information needed to ignore the unneeded bits
of the 2x2x2x2 swizzle mask.

(that means we can actually fit 2 of them into 16 bits for a SV-64 encoding!)

unfortunately though this trick will still require a follow-up MV to get the
altered elements back into their target vec4 positions:

    # select 2 elements from a vec4,
    # making a vec2 temporary
    SUBVL=2 fadd v2.YZ, v3.XW, v4.YW

    # now get the contents of v2 back into
    # v3 XW positions...  errr... how?
    SUBVL=2 fmv v3.XW, v2.YZ

would that actually work? y'know, i think it might.  actually it would be:

    SUBVL=2 fadd v2.YZXX, v3.XWXX, v4.YWXX
    SUBVL=2 fmv v3.XWXX, v2.YZXX

except that because SUBVL=2 the top 2 swizzle indices would be ignored.
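
(a quick way to sanity-check the two-instruction sequence is to model it.
this sketch assumes one particular reading of the syntax: element i of the
operation reads the source lanes named by the i-th swizzle letters and
writes the destination lane named by the i-th destination letter, and only
the first SUBVL letters are looked at.  the functions lanes, fadd and fmv
are hypothetical models, not real instructions' semantics.)

```python
LANES = "XYZW"


def lanes(swizzle, subvl):
    """Lane indices for the first subvl letters; the rest are ignored."""
    return [LANES.index(c) for c in swizzle[:subvl]]


def fadd(dst, dst_sw, a, a_sw, b, b_sw, subvl):
    """dst[dst_sw[i]] = a[a_sw[i]] + b[b_sw[i]] for i in 0..subvl-1."""
    for d, i, j in zip(lanes(dst_sw, subvl), lanes(a_sw, subvl), lanes(b_sw, subvl)):
        dst[d] = a[i] + b[j]


def fmv(dst, dst_sw, src, src_sw, subvl):
    """dst[dst_sw[i]] = src[src_sw[i]] for i in 0..subvl-1."""
    for d, s in zip(lanes(dst_sw, subvl), lanes(src_sw, subvl)):
        dst[d] = src[s]


v2 = [0.0, 0.0, 0.0, 0.0]
v3 = [1.0, 2.0, 3.0, 4.0]
v4 = [10.0, 20.0, 30.0, 40.0]

fadd(v2, "YZXX", v3, "XWXX", v4, "YWXX", subvl=2)
fmv(v3, "XWXX", v2, "YZXX", subvl=2)

print(v2)  # [0.0, 21.0, 44.0, 0.0]
print(v3)  # [21.0, 2.0, 3.0, 44.0]
```

under that reading, v3.X and v3.W receive the two sums and v3.Y and v3.Z are
left untouched, and the trailing XX letters never influence the result.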

what are your thoughts there?
