[Libre-soc-bugs] [Bug 228] VP9 optimizations
bugzilla-daemon at libre-soc.org
Mon Sep 26 07:55:14 BST 2022
https://bugs.libre-soc.org/show_bug.cgi?id=228
--- Comment #5 from Konstantinos Margaritis <konstantinos at vectorcamp.gr> ---
OK, the SVP64 versions of a few variance functions used in VP9 should be complete as of commit
https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=101e3a30f90f567eaa2b7f5f7fd2306a04bfcad4
As a short explanation: I implemented some glue C code that calls the Python
simulator (pypowersim), plus some wrapper functions that are called from the
actual VP9 testsuite. When a wrapper function is called, it gathers the
arguments and memory that the function needs using the CPython API and then
calls the function. When done, it takes the result object from the simulator,
retrieves the memory and/or registers in which the function is expected to
leave its results and, again using the CPython API, returns the results to the
caller (in this case the VP9 testsuite).
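The wrapper flow described above can be sketched roughly as follows. This is
only an illustration in Python, not the glue code from the commit: the real
glue is C using the CPython API, and fake_simulator, sse_wrapper, the register
numbers and the memory addresses are all hypothetical names and values chosen
for this example. In particular, fake_simulator is a stand-in and does not
reflect the pypowersim interface.

```python
from typing import Callable, Dict, Tuple

Regs = Dict[int, int]  # register number -> value
Mem = Dict[int, int]   # byte address -> byte value


def fake_simulator(regs: Regs, mem: Mem) -> Tuple[Regs, Mem]:
    """Hypothetical stand-in for the simulator; NOT the pypowersim API.

    Pretends to execute a sum-of-squared-errors kernel with
    r3 = src address, r4 = ref address, r5 = byte count,
    leaving the result in r3.
    """
    src, ref, n = regs[3], regs[4], regs[5]
    sse = sum((mem[src + i] - mem[ref + i]) ** 2 for i in range(n))
    out = dict(regs)
    out[3] = sse
    return out, mem


def sse_wrapper(
    src: bytes,
    ref: bytes,
    simulator: Callable[[Regs, Mem], Tuple[Regs, Mem]] = fake_simulator,
) -> int:
    """Wrapper a test suite could call in place of the native function."""
    if len(src) != len(ref):
        raise ValueError("src and ref must have the same length")

    # 1. Gather the memory the function needs (addresses are arbitrary).
    src_base = 0x1000
    ref_base = src_base + len(src)
    mem: Mem = {}
    for i, byte in enumerate(src):
        mem[src_base + i] = byte
    for i, byte in enumerate(ref):
        mem[ref_base + i] = byte

    # 2. Gather the arguments into registers.
    regs: Regs = {3: src_base, 4: ref_base, 5: len(src)}

    # 3. Run the function in the (stand-in) simulator.
    out_regs, _out_mem = simulator(regs, mem)

    # 4. Retrieve the result register and hand it back to the caller.
    return out_regs[3]


# (1-1)^2 + (2-0)^2 + (3-0)^2 = 0 + 4 + 9
print(sse_wrapper(bytes([1, 2, 3]), bytes([1, 0, 0])))
```

The caller only ever sees plain bytes in and an integer out, so the same test
can be pointed at either the native function or the simulated one, which is
what the C/ and SVP64/ test prefixes in the log below reflect.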
This enabled a number of frequently called functions for the VP8/VP9 codec to
be converted to SVP64 assembly without even having access to the hardware. But
it is really slow, so I had to reduce the number of iterations to far below the
actual count.
Tests can be run by invoking make all and then running:
$ ./libvpx_variance_test
[==========] Running 104 tests from 8 test suites.
[----------] Global test environment set-up.
[----------] 2 tests from C/VpxSseTest
[ RUN ] C/VpxSseTest.RefSse/0
[ OK ] C/VpxSseTest.RefSse/0 (0 ms)
[ RUN ] C/VpxSseTest.MaxSse/0
[ OK ] C/VpxSseTest.MaxSse/0 (0 ms)
[----------] 2 tests from C/VpxSseTest (1 ms total)
[----------] 2 tests from SVP64/VpxSseTest
[ RUN ] SVP64/VpxSseTest.RefSse/0
[ OK ] SVP64/VpxSseTest.RefSse/0 (55276 ms)
[ RUN ] SVP64/VpxSseTest.MaxSse/0
[ OK ] SVP64/VpxSseTest.MaxSse/0 (19063 ms)
[----------] 2 tests from SVP64/VpxSseTest (74339 ms total)
[----------] 8 tests from C/VpxMseTest
[ RUN ] C/VpxMseTest.RefMse/0
[ OK ] C/VpxMseTest.RefMse/0 (0 ms)
[ RUN ] C/VpxMseTest.RefMse/1
[ OK ] C/VpxMseTest.RefMse/1 (0 ms)
[ RUN ] C/VpxMseTest.RefMse/2
[ OK ] C/VpxMseTest.RefMse/2 (0 ms)
[ RUN ] C/VpxMseTest.RefMse/3
[ OK ] C/VpxMseTest.RefMse/3 (0 ms)
[ RUN ] C/VpxMseTest.MaxMse/0
[ OK ] C/VpxMseTest.MaxMse/0 (0 ms)
[ RUN ] C/VpxMseTest.MaxMse/1
[ OK ] C/VpxMseTest.MaxMse/1 (0 ms)
[ RUN ] C/VpxMseTest.MaxMse/2
[ OK ] C/VpxMseTest.MaxMse/2 (0 ms)
[ RUN ] C/VpxMseTest.MaxMse/3
[ OK ] C/VpxMseTest.MaxMse/3 (0 ms)
[----------] 8 tests from C/VpxMseTest (0 ms total)
[----------] 8 tests from SVP64/VpxMseTest
[ RUN ] SVP64/VpxMseTest.RefMse/0
[ OK ] SVP64/VpxMseTest.RefMse/0 (611909 ms)
[ RUN ] SVP64/VpxMseTest.RefMse/1
[ OK ] SVP64/VpxMseTest.RefMse/1 (326659 ms)
[ RUN ] SVP64/VpxMseTest.RefMse/2
[ OK ] SVP64/VpxMseTest.RefMse/2 (340466 ms)
[ RUN ] SVP64/VpxMseTest.RefMse/3
[ OK ] SVP64/VpxMseTest.RefMse/3 (193487 ms)
[ RUN ] SVP64/VpxMseTest.MaxMse/0
[ OK ] SVP64/VpxMseTest.MaxMse/0 (209029 ms)
[ RUN ] SVP64/VpxMseTest.MaxMse/1
[ RUN ] SVP64/VpxMseTest.MaxMse/3
[ OK ] SVP64/VpxMseTest.MaxMse/3 (66713 ms)
[----------] 8 tests from SVP64/VpxMseTest (1976552 ms total)
[----------] 40 tests from C/VpxVarianceTest
[ RUN ] C/VpxVarianceTest.Zero/0
[ OK ] C/VpxVarianceTest.Zero/0 (0 ms)
[ RUN ] C/VpxVarianceTest.Zero/1
[ OK ] C/VpxVarianceTest.Zero/1 (0 ms)
[ RUN ] C/VpxVarianceTest.Zero/2
[ OK ] C/VpxVarianceTest.Zero/2 (0 ms)
[ RUN ] C/VpxVarianceTest.Zero/3
[ OK ] C/VpxVarianceTest.Zero/3 (0 ms)
[ RUN ] C/VpxVarianceTest.Zero/4
[ OK ] C/VpxVarianceTest.Zero/4 (0 ms)
[ RUN ] C/VpxVarianceTest.Zero/5
[ OK ] C/VpxVarianceTest.Zero/5 (0 ms)
[ RUN ] C/VpxVarianceTest.Zero/6
[ OK ] C/VpxVarianceTest.Zero/6 (0 ms)
[ RUN ] C/VpxVarianceTest.Zero/7
[ OK ] C/VpxVarianceTest.Zero/7 (0 ms)
[ RUN ] C/VpxVarianceTest.Zero/8
[ OK ] C/VpxVarianceTest.Zero/8 (0 ms)
[ RUN ] C/VpxVarianceTest.Zero/9
[ OK ] C/VpxVarianceTest.Zero/9 (0 ms)
[ RUN ] C/VpxVarianceTest.Ref/0
[ OK ] C/VpxVarianceTest.Ref/0 (0 ms)
[ RUN ] C/VpxVarianceTest.Ref/1
[ OK ] C/VpxVarianceTest.Ref/1 (0 ms)
[ RUN ] C/VpxVarianceTest.Ref/2
[ OK ] C/VpxVarianceTest.Ref/2 (0 ms)
[ RUN ] C/VpxVarianceTest.Ref/3
[ OK ] C/VpxVarianceTest.Ref/3 (0 ms)
[ RUN ] C/VpxVarianceTest.Ref/4
[ OK ] C/VpxVarianceTest.Ref/4 (0 ms)
[ RUN ] C/VpxVarianceTest.Ref/5
[ OK ] C/VpxVarianceTest.Ref/5 (1 ms)
[ RUN ] C/VpxVarianceTest.Ref/6
[ OK ] C/VpxVarianceTest.Ref/6 (0 ms)
[ RUN ] C/VpxVarianceTest.Ref/7
[ OK ] C/VpxVarianceTest.Ref/7 (0 ms)
[ RUN ] C/VpxVarianceTest.Ref/8
[ OK ] C/VpxVarianceTest.Ref/8 (0 ms)
[ RUN ] C/VpxVarianceTest.Ref/9
[ OK ] C/VpxVarianceTest.Ref/9 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/0
[ OK ] C/VpxVarianceTest.RefStride/0 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/1
[ OK ] C/VpxVarianceTest.RefStride/1 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/2
[ OK ] C/VpxVarianceTest.RefStride/2 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/3
[ OK ] C/VpxVarianceTest.RefStride/3 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/4
[ OK ] C/VpxVarianceTest.RefStride/4 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/5
[ OK ] C/VpxVarianceTest.RefStride/5 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/6
[ OK ] C/VpxVarianceTest.RefStride/6 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/7
[ OK ] C/VpxVarianceTest.RefStride/7 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/8
[ OK ] C/VpxVarianceTest.RefStride/8 (0 ms)
[ RUN ] C/VpxVarianceTest.RefStride/9
[ OK ] C/VpxVarianceTest.RefStride/9 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/0
[ OK ] C/VpxVarianceTest.OneQuarter/0 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/1
[ OK ] C/VpxVarianceTest.OneQuarter/1 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/2
[ OK ] C/VpxVarianceTest.OneQuarter/2 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/3
[ OK ] C/VpxVarianceTest.OneQuarter/3 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/4
[ OK ] C/VpxVarianceTest.OneQuarter/4 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/5
[ OK ] C/VpxVarianceTest.OneQuarter/5 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/6
[ OK ] C/VpxVarianceTest.OneQuarter/6 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/7
[ OK ] C/VpxVarianceTest.OneQuarter/7 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/8
[ OK ] C/VpxVarianceTest.OneQuarter/8 (0 ms)
[ RUN ] C/VpxVarianceTest.OneQuarter/9
[ OK ] C/VpxVarianceTest.OneQuarter/9 (0 ms)
[----------] 40 tests from C/VpxVarianceTest (1 ms total)
[----------] 40 tests from SVP64/VpxVarianceTest
[ RUN ] SVP64/VpxVarianceTest.Zero/0
[ OK ] SVP64/VpxVarianceTest.Zero/0 (3115258 ms)
[ RUN ] SVP64/VpxVarianceTest.Zero/1
[ OK ] SVP64/VpxVarianceTest.Zero/1 (1599237 ms)
[ RUN ] SVP64/VpxVarianceTest.Zero/2
[ OK ] SVP64/VpxVarianceTest.Zero/2 (1632482 ms)
[ RUN ] SVP64/VpxVarianceTest.Zero/3
[ OK ] SVP64/VpxVarianceTest.Zero/3 (866733 ms)
[ RUN ] SVP64/VpxVarianceTest.Zero/4
[ OK ] SVP64/VpxVarianceTest.Zero/4 (488774 ms)
[ RUN ] SVP64/VpxVarianceTest.Zero/5
[ OK ] SVP64/VpxVarianceTest.Zero/5 (506917 ms)
[ RUN ] SVP64/VpxVarianceTest.Zero/6
[ OK ] SVP64/VpxVarianceTest.Zero/6 (315762 ms)
[ RUN ] SVP64/VpxVarianceTest.Zero/7
[ OK ] SVP64/VpxVarianceTest.Zero/7 (220856 ms)
[ RUN ] SVP64/VpxVarianceTest.Zero/8
[ OK ] SVP64/VpxVarianceTest.Zero/8 (235377 ms)
[ RUN ] SVP64/VpxVarianceTest.Zero/9
[ OK ] SVP64/VpxVarianceTest.Zero/9 (189669 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/0
[ OK ] SVP64/VpxVarianceTest.Ref/0 (2390526 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/1
[ OK ] SVP64/VpxVarianceTest.Ref/1 (1254458 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/2
[ OK ] SVP64/VpxVarianceTest.Ref/2 (1280849 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/3
[ OK ] SVP64/VpxVarianceTest.Ref/3 (700744 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/4
[ OK ] SVP64/VpxVarianceTest.Ref/4 (414212 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/5
[ OK ] SVP64/VpxVarianceTest.Ref/5 (428765 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/6
[ OK ] SVP64/VpxVarianceTest.Ref/6 (280934 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/7
[ OK ] SVP64/VpxVarianceTest.Ref/7 (210813 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/8
[ OK ] SVP64/VpxVarianceTest.Ref/8 (219275 ms)
[ RUN ] SVP64/VpxVarianceTest.Ref/9
[ OK ] SVP64/VpxVarianceTest.Ref/9 (183868 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/0
[ OK ] SVP64/VpxVarianceTest.RefStride/0 (2431067 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/1
[ OK ] SVP64/VpxVarianceTest.RefStride/1 (1291792 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/2
[ OK ] SVP64/VpxVarianceTest.RefStride/2 (1313705 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/3
[ OK ] SVP64/VpxVarianceTest.RefStride/3 (741109 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/4
[ OK ] SVP64/VpxVarianceTest.RefStride/4 (454184 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/5
[ OK ] SVP64/VpxVarianceTest.RefStride/5 (467841 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/6
[ OK ] SVP64/VpxVarianceTest.RefStride/6 (324436 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/7
[ OK ] SVP64/VpxVarianceTest.RefStride/7 (254069 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/8
[ OK ] SVP64/VpxVarianceTest.RefStride/8 (263426 ms)
[ RUN ] SVP64/VpxVarianceTest.RefStride/9
[ OK ] SVP64/VpxVarianceTest.RefStride/9 (229721 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/0
[ OK ] SVP64/VpxVarianceTest.OneQuarter/0 (826340 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/1
[ OK ] SVP64/VpxVarianceTest.OneQuarter/1 (443795 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/2
[ OK ] SVP64/VpxVarianceTest.OneQuarter/2 (450765 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/3
[ OK ] SVP64/VpxVarianceTest.OneQuarter/3 (258029 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/4
[ OK ] SVP64/VpxVarianceTest.OneQuarter/4 (160789 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/5
[ OK ] SVP64/VpxVarianceTest.OneQuarter/5 (164404 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/6
[ OK ] SVP64/VpxVarianceTest.OneQuarter/6 (115362 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/7
[ OK ] SVP64/VpxVarianceTest.OneQuarter/7 (91212 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/8
[ OK ] SVP64/VpxVarianceTest.OneQuarter/8 (93363 ms)
[ RUN ] SVP64/VpxVarianceTest.OneQuarter/9
[ OK ] SVP64/VpxVarianceTest.OneQuarter/9 (80609 ms)
[----------] 40 tests from SVP64/VpxVarianceTest (26991529 ms total)
[----------] 2 tests from C/SumOfSquaresTest
[ RUN ] C/SumOfSquaresTest.Const/0
[ OK ] C/SumOfSquaresTest.Const/0 (0 ms)
[ RUN ] C/SumOfSquaresTest.Ref/0
[ OK ] C/SumOfSquaresTest.Ref/0 (0 ms)
[----------] 2 tests from C/SumOfSquaresTest (1 ms total)
[----------] 2 tests from SVP64/SumOfSquaresTest
[ RUN ] SVP64/SumOfSquaresTest.Const/0
[ OK ] SVP64/SumOfSquaresTest.Const/0 (636899 ms)
[ RUN ] SVP64/SumOfSquaresTest.Ref/0
[ OK ] SVP64/SumOfSquaresTest.Ref/0 (649444 ms)
[----------] 2 tests from SVP64/SumOfSquaresTest (1286343 ms total)
[----------] Global test environment tear-down
[==========] 104 tests from 8 test suites ran. (30328767 ms total)
[ PASSED ] 104 tests.