[Libre-soc-dev] CR-result-driven predication
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Sun Dec 13 01:49:11 GMT 2020
https://libre-soc.org/openpower/sv/svp_rewrite/svp64/discussion/?updated
in the Modes section i had an idea that will be very simple to
implement and works well in parallel processing.
this stems originally from data-dependent fail-on-first, where the
first step (first enhancement) was: to test the data for being
zero/nonzero and to truncate the vector at the first failed test.
thus the result vector only comprises a sequence of elements where the
test succeeded.
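
as a rough illustration of that first step, here is a minimal Python sketch of zero/nonzero fail-on-first. the function name and the choice to exclude the failing element from the new VL are assumptions made for the example, not taken from the spec:

```python
def ffirst_truncate(results, test=lambda x: x != 0):
    """Hypothetical model of data-dependent fail-on-first.

    Walks the result vector in element order and truncates it at the
    first element whose test fails. Returns (kept_elements, new_vl).
    Assumption: the failing element itself is excluded.
    """
    for i, r in enumerate(results):
        if not test(r):
            # the new VL depends on every earlier element having passed
            return list(results[:i]), i
    return list(results), len(results)


kept, new_vl = ffirst_truncate([3, 5, 0, 7])
```

note that the loop has to find the first failure in element order before the new VL is known.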
the next enhancement, only possible on OpenPOWER, was to include CR
testing. when Rc=1 a CR is generated: allow then for testing of the
CR bits in exactly the same way as branches and activate ffirst based
not exclusively on zero/nonzero but eq, lt, gt etc. as well.
however to implement all this is a pain in the neck, requiring
speculative execution (shadows) followed by post-analysis, altering
VL, and holding up execution whilst waiting for the new VL.
what if, then, a new mode was added that performed the exact same
testing, but pushed that test result *into the predication*? i.e. the
test was ANDed with the predicate bit and thus stopped the result from
being written (or caused a zero to be written when zero-mode is
active).
the huge advantage of this is that, unlike ffirst, there is no hold-up
by way of post-analysis on the chain of results, to work out a new VL.
in this new mode the testing, the ANDing with predicate, is all
parallel and independent.
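
a minimal sketch of the idea in Python, for illustration only. `cr_bits` and `cr_predicate` are hypothetical names, the CR field is modelled simply as a signed comparison of the result against zero, and the `invert` flag is an assumption borrowed from how branches can test a bit for being set or clear:

```python
def cr_bits(result):
    """Hypothetical model of the CR field an Rc=1 operation would
    generate: the result compared (signed) against zero."""
    return {"lt": result < 0, "gt": result > 0, "eq": result == 0}


def cr_predicate(results, pred_mask, cr_bit, invert=False):
    """AND each element's CR test with its predicate bit.

    Each element is handled without reference to any other element,
    so in hardware every lane could be evaluated independently.
    Returns the combined (effective) predicate mask.
    """
    new_mask = 0
    for i, r in enumerate(results):
        test = cr_bits(r)[cr_bit] != invert
        if (pred_mask >> i) & 1 and test:
            new_mask |= 1 << i
    return new_mask


mask = cr_predicate([5, -1, 0, 7], 0b1011, "gt")
```

unlike the ffirst case there is no scan for a "first failure" and VL is never altered: the loop here is only a software convenience.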
whilst a zero/nonzero test might not sound useful, some of the arithmetic
tests (lt, gt) almost certainly would be. imagine doing a signed ADD where
only results greater than zero were placed in the destination.
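
the signed ADD case might look roughly like this in Python. `add_gt_zero` is a made-up name, Python integers stand in for signed registers (no wraparound is modelled), and zeroing is assumed to write zero to every element whose combined predicate is clear:

```python
def add_gt_zero(a, b, dest, pred_mask, zeroing=False):
    """Hypothetical model: elementwise signed ADD where a result is
    only written if its predicate bit is set AND the result is > 0."""
    out = list(dest)
    for i in range(len(out)):
        s = a[i] + b[i]
        if (pred_mask >> i) & 1 and s > 0:
            out[i] = s        # test passed: write the result
        elif zeroing:
            out[i] = 0        # zero-mode: write zero instead
        # otherwise the destination element is left untouched
    return out


a = [1, -5, 3, 2]
b = [2, 1, -3, 4]
dest = [9, 9, 9, 9]
print(add_gt_zero(a, b, dest, 0b1111))                # [3, 9, 9, 6]
print(add_gt_zero(a, b, dest, 0b1111, zeroing=True))  # [3, 0, 0, 6]
```

the sums are 3, -4, 0 and 6, so only elements 0 and 3 reach the destination.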
the hilarious thing is that all the pieces are already in place: CRs
and predication. it is the mixing of these two that makes for some
powerful savings.
l.