[Libre-soc-bugs] [Bug 230] Video opcode development and discussion

bugzilla-daemon at libre-soc.org
Sat Dec 12 11:35:24 GMT 2020


https://bugs.libre-soc.org/show_bug.cgi?id=230

--- Comment #25 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to cand from comment #24)
> Yeah, the full arbitrary-width is not viable. And I know you find
> fixed-width ops ugly, but in this case it may be necessary.
> 
> > ok so there *is* one possibility, there: a special vec4 only operation that takes 4x elements and multiplies them all together, targeting a non-vec4 dest.
> >
> > however these would be very specialist operations, that i would like to defer until "Phase 2" i.e. after the initial implementation of SV looping.
> 
> I realized a horizontal 4-element add would also be useful for the generic
> pixel pack case, since | and + are the same op when no bits overlap. It
> would replace the last three ORs, speeding it up by 1-2 clocks per pixel,
> plus the avoided stall (I'm assuming the horizontal 4-op can avoid the
> normal non-4-offset reg stall).

ok, a 64 bit down to 16 or even 8: no problem.  ish.  4x 64 bit down to 1x 64
bit, though, needs a micro-coding FSM to generate the multiple operations.

it's... doable.

can you add this to the list of ops in the av_opcodes page? spec it up with
some simple pseudocode even if it is 3 lines.
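
as a starting point, here is a rough sketch of what that pseudocode might look
like, written as runnable Python. the names (hadd4, pack_pixel_or,
pack_pixel_hadd), the 8-bit RGBA channel layout and the 64-bit result width are
all assumptions made for illustration, not anything from the av_opcodes page.
it only shows the arithmetic point made in comment #24: when no bits overlap,
the horizontal add gives the same result as the chain of ORs.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit scalar destination


def pack_pixel_or(r, g, b, a):
    """Reference pixel pack: shift each channel into place, then three ORs."""
    return (r << 24) | (g << 16) | (b << 8) | a


def hadd4(elements):
    """Hypothetical horizontal add: reduce exactly 4 elements to 1 scalar."""
    assert len(elements) == 4
    return sum(elements) & MASK64


def pack_pixel_hadd(r, g, b, a):
    """Same pack, with the three ORs replaced by one horizontal 4-element add."""
    shifted = [r << 24, g << 16, b << 8, a]  # per-element shifts (vec4)
    return hadd4(shifted)


# with 8-bit channels no bits overlap, so + and | agree
assert pack_pixel_or(0x12, 0x34, 0x56, 0x78) == 0x12345678
assert pack_pixel_hadd(0x12, 0x34, 0x56, 0x78) == 0x12345678

# if a channel is wider than its slot, bits overlap and the two differ
assert pack_pixel_or(0, 0, 0x1, 0x100) == 0x100
assert pack_pixel_hadd(0, 0, 0x1, 0x100) == 0x200
```

the last two asserts are the caveat: the substitution is only valid if the
inputs have already been masked or clamped to their field width.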


> This then led to the opposite operation too, a 1-to-4 bit scatter with
> shift and AND.

oof.  ah.  this *might* be covered with bitmanip but again can you spec it up,
and add some pseudocode?
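
a minimal sketch of the scatter, again in Python and again under assumptions:
the name scatter4, the default shift amounts and the single shared mask are
hypothetical, chosen to mirror an 8-bit RGBA unpack. the actual operand
encoding would be up to whoever specs it.

```python
def scatter4(word, shifts=(24, 16, 8, 0), mask=0xFF):
    """Hypothetical 1-to-4 scatter: one scalar in, four elements out.

    Each destination element is the source shifted right by its own
    amount, then ANDed with a common mask.
    """
    assert len(shifts) == 4
    return [(word >> s) & mask for s in shifts]


# unpack a 32-bit RGBA pixel into four 8-bit channels
assert scatter4(0x12345678) == [0x12, 0x34, 0x56, 0x78]

# other field layouts only need different shifts and mask, e.g. 4x 4-bit
assert scatter4(0xABCD, shifts=(12, 8, 4, 0), mask=0xF) == [0xA, 0xB, 0xC, 0xD]
```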
