[Libre-soc-bugs] [Bug 230] Video opcode development and discussion

Tue Dec 15 17:08:11 GMT 2020

https://bugs.libre-soc.org/show_bug.cgi?id=230

--- Comment #57 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to cand from comment #56)
> The immediate required for same-width is just 5/7 bits. 1 bit for BE/LE, 4
> bits for 1-16, 6 for full 64 range.

yeah i think going beyond 1-16 is not sane to try to fit, here.

... click... ah HA!

   rlwimi RA,RS,SH,MB,ME (Rc=0)
   rlwimi. RA,RS,SH,MB,ME (Rc=1)

https://libre-soc.org/openpower/isa/fixedshift/

half of those are 1-in 1-out! that makes them candidates for adding into the
Monster Shift Register FSM after all! now we are cooking with gas.
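
for reference, a minimal python sketch of what rlwimi does, following the usual rotate / mask / insert form. the helper names (rotl32, mask32, rlwimi) are illustrative only and not taken from any Libre-SOC code, and it models just the low 32 bits of the 64-bit register, which is an assumption made to keep it short:

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate the low 32 bits of x left by n (n taken mod 32)."""
    x &= MASK32
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask32(mb, me):
    """Ones from bit mb through bit me in MSB0 numbering (bit 0 = top bit).
    Wraps around when mb > me."""
    from_mb = MASK32 >> mb            # bits mb..31 set
    after_me = MASK32 >> (me + 1)     # bits me+1..31 set
    if mb <= me:
        return from_mb & ~after_me & MASK32
    return (from_mb | ~after_me) & MASK32

def rlwimi(ra, rs, sh, mb, me):
    """Rotate rs left by sh, then insert it into ra under the mask."""
    r = rotl32(rs, sh)
    m = mask32(mb, me)
    return (r & m) | (ra & ~m & MASK32)

# insert 0xFF (rotated up by 8) into the second-lowest byte of ra
print(hex(rlwimi(0xAAAAAAAA, 0x000000FF, 8, 16, 23)))  # 0xaaaaffaa
```

only the bits selected by the mask are replaced, which is what makes a sequence of these usable for field-by-field pixel packing and unpacking.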

right.

so two things:

1) if the shift-mask schedule for each Vulkan Texture can be encoded into an
algorithm, sequentially programming different values of sh, mb and me into the
shifters, we have all VK Texture ops covered in micro-coding style.

   (the only complexity here is 12bit
    but this is not insurmountable)
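
a rough sketch of the "schedule" idea in python: one table of (sh, mb, me) triples per format, stepped through in sequence. the 1555 layout assumed here (A in bit 15, R in 14..10, G in 9..5, B in 4..0) and the target of one byte per channel with the field left-aligned are both assumptions for illustration, as are all the names:

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    x &= MASK32
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask32(mb, me):
    # MSB0 bit numbering, wraps when mb > me
    from_mb = MASK32 >> mb
    after_me = MASK32 >> (me + 1)
    if mb <= me:
        return from_mb & ~after_me & MASK32
    return (from_mb | ~after_me) & MASK32

def rlwimi(ra, rs, sh, mb, me):
    m = mask32(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK32)

# hypothetical schedule: unpack assumed A1R5G5B5 into one byte per channel
SCHEDULE_1555 = [
    (3, 24, 28),   # B: bits 4..0   -> bits 7..3
    (6, 16, 20),   # G: bits 9..5   -> bits 15..11
    (9, 8, 12),    # R: bits 14..10 -> bits 23..19
    (16, 0, 0),    # A: bit 15      -> bit 31
]

def run_schedule(pixel, schedule):
    """Apply each (sh, mb, me) step in order, accumulating into ra."""
    ra = 0
    for sh, mb, me in schedule:
        ra = rlwimi(ra, pixel, sh, mb, me)
    return ra

print(hex(run_schedule(0xD47F, SCHEDULE_1555)))  # 0x80a818f8
```

each step is 1-in 1-out plus the accumulator, so the FSM only has to walk the table and feed the three constants to the shifter.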

2) if we can work out how to augment the shift-mask schedule by taking the
SUBVL index as some sort of augmenter, we can probably cover the majority of
pixel/audio formats there.

methods include:

2a) macro op fusion.  R gets its own rlwimi, G gets its own rlwimi, B likewise,
A likewise

    (that's a big macro op sequence though)

2b) an SPR specifying the augmentation schedule

    (an SPR as input to the Monster FSM
     is kinda ok)

2c) somehow polluting or giving alternative meaning to the 24 SV prefix bits.

    (you can tell i don't like this one
     it means the SV Prefix Decoder needs
     to know the operation)

2d) an algorithmic augmentation which otherwise keeps the exact same sh, mb, me
constants

    for j in range(SUBVL):
        n <- SH * (j*something)
        r <- ROTL32((RS)[32:63], n)
        m <- MASK(MB+32, ME+32)
        RA <- r&m | (RA) & ¬m

something that offsets sh, mb and me by some amount.
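
one possible reading of 2d as a python sketch, where sh, mb and me are each offset by a fixed per-index step. the additive steps (sh_step, mask_step) are purely hypothetical stand-ins for the "something" above, and the 1555 field layout is the same assumed one (R in 14..10, G in 9..5, B in 4..0):

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    x &= MASK32
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask32(mb, me):
    # MSB0 bit numbering, wraps when mb > me
    from_mb = MASK32 >> mb
    after_me = MASK32 >> (me + 1)
    if mb <= me:
        return from_mb & ~after_me & MASK32
    return (from_mb | ~after_me) & MASK32

def rlwimi_subvl(ra, rs, sh, mb, me, subvl, sh_step, mask_step):
    """Repeat rlwimi SUBVL times, offsetting the constants by index j.
    sh_step and mask_step are hypothetical augmentation parameters."""
    for j in range(subvl):
        n = (sh + j * sh_step) & 31
        m = mask32((mb + j * mask_step) & 31, (me + j * mask_step) & 31)
        ra = (rotl32(rs, n) & m) | (ra & ~m & MASK32)
    return ra

# B, G, R of an assumed 1555 pixel from one set of constants, SUBVL=3
print(hex(rlwimi_subvl(0, 0xD47F, 3, 24, 28, 3, 3, -8)))  # 0xa818f8
```

with these particular steps the three 5-bit colour fields fall out of one instruction, but the 1-bit alpha field does not fit the same stride, which hints at why a plain linear offset may not cover every format.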

OR

2e) because we know that the pixels are in 16 bit (1555) then from the elwidth
we *know* that high bits of sh, mb and me will not be used.  it *might* be
possible to use those to provide the "augmentation"

    (i like this route although it will
     take time to analyse. sigh)
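
to make the bit-budget in 2e concrete, a tiny python sketch: sh, mb and me are 5-bit fields, and with 16-bit elements only 4 bits of each are needed to address a position, leaving one spare bit per field. how those spare bits would actually be interpreted is not decided here, so split_field is only a hypothetical helper showing the split:

```python
FIELD_BITS = 5  # width of each of sh, mb, me in rlwimi

def split_field(value, elwidth=16):
    """Split a 5-bit field into (position bits, spare high bits)
    for a given element width. Illustrative only."""
    used_bits = (elwidth - 1).bit_length()       # 4 for elwidth=16
    value &= (1 << FIELD_BITS) - 1
    low = value & ((1 << used_bits) - 1)
    spare = value >> used_bits
    return low, spare

def spare_bits_total(elwidth=16, nfields=3):
    """Total spare bits across sh, mb and me."""
    return nfields * (FIELD_BITS - (elwidth - 1).bit_length())

print(split_field(0b10011))   # (3, 1)
print(spare_bits_total())     # 3
```

so at elwidth=16 there would be three bits in total to carry the "augmentation", and more at elwidth=8.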
