[Libre-soc-bugs] [Bug 230] Video opcode development and discussion

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Sat May 29 09:40:13 BST 2021


https://bugs.libre-soc.org/show_bug.cgi?id=230

--- Comment #73 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to cand from comment #71)
> So in starting the mp3 SV code, I hit one of ppc's obvious lacks: it has no
> move GPR to/from FPR instrs. This means that zeroing a FPR requires a memory
> load, and that zero takes memory space too.

ahh that's really important.  the SVP64 swizzle allows loading 1.0 as well
as 0.0 as constants.  we considered adding Pi as well, but ran out of
space in the available bitfield.

Jeff Bush points out the cost of memory loads in his Nyuzi paper: they're
expensive, particularly for GPU workloads (128 registers).

also, going through the memory path costs a significant number of cycles,
which in turn "backs up" a much larger quantity of "In-Flight" resources in
an Out-of-Order system.

> I think they did that because loading immediates sucks for them. That is not
> a concern for us, I believe we (will) have efficient immediate loading, no?
> In that case, I propose we add mtf/mff. A standalone "zero FPR" instr would
> work too, but more flexibility is better.

agreed.
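
as a rough illustration only (not taken from the bug discussion, and not a
definition of the proposed mtf/mff instructions): a GPR to/from FPR move is
assumed here to be a plain bit-for-bit copy with no numeric conversion.  the
helper names gpr_to_fpr and fpr_to_gpr are hypothetical, and the sketch
assumes IEEE 754 binary64 doubles.  it shows why an integer zero in a GPR
would be enough to zero an FPR without touching memory.

```c
#include <stdint.h>
#include <string.h>

/* hypothetical model of a GPR->FPR move: copy the 64 bits unchanged.
 * assumes double is IEEE 754 binary64 (true on ppc64). */
static double gpr_to_fpr(uint64_t gpr)
{
    double fpr;
    memcpy(&fpr, &gpr, sizeof fpr);   /* bit copy, no int->float conversion */
    return fpr;
}

/* hypothetical model of an FPR->GPR move: the reverse bit copy. */
static uint64_t fpr_to_gpr(double fpr)
{
    uint64_t gpr;
    memcpy(&gpr, &fpr, sizeof gpr);
    return gpr;
}
```

with this model, gpr_to_fpr(0) gives +0.0 (the all-zeros bit pattern), and
the constant 1.0 corresponds to the integer 0x3FF0000000000000, so both could
be built from immediates in a GPR rather than loaded from memory.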

> Modern POWER with altivec uses the vector xor instruction for that, since
> the FPR space is shared with the vector register space. They have no "move
> GPR to/from VR" either.

Paul Mackerras kindly pointed out that v2.06 apparently had it, but it
was quickly removed (in v2.07).


