[Libre-soc-dev] CR-result-driven predication

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sun Dec 13 01:49:11 GMT 2020


in the Modes section i had an idea that will be very simple to
implement and works well in parallel processing.

this stems originally from data-dependent fail-on-first, where the
first step (first enhancement) was: to test the data for being
zero/nonzero and to truncate the vector at the first failed test.

thus the result vector only comprises a sequence of elements where the
test succeeded.
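
as a rough illustration of that truncation, here is a minimal python sketch. the function name ffirst_truncate and the default nonzero test are made up for this example and are not taken from any spec; it only models "stop at the first failed test and shorten the vector".

```python
def ffirst_truncate(results, test=lambda x: x != 0):
    """Model of data-dependent fail-on-first (hypothetical helper).

    Walks the element results in order and stops at the first one that
    fails the test. Returns the new vector length and the surviving
    elements, which are exactly the ones before the first failure.
    """
    for i, r in enumerate(results):
        if not test(r):
            # first failure: the vector is truncated here
            return i, list(results[:i])
    # no failure: the vector length is unchanged
    return len(results), list(results)
```

note that the new length depends on scanning the results in order, which is the dependency that makes this costly in hardware.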

the next enhancement, only possible on OpenPOWER, was to include CR
testing.  when Rc=1 a CR is generated: allow then for testing of the
CR bits in exactly the same way as branches and activate ffirst based
not exclusively on zero/nonzero but eq, lt, gt etc. as well.
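
a sketch of what testing CR bits might look like, assuming the usual Rc=1 behaviour where the result is compared (signed) against zero to give lt, gt and eq. the helper names are hypothetical, the SO bit is left out for brevity, and the "want" argument stands in for the branch-style "test for bit set or bit clear" choice.

```python
def to_signed(value, width=64):
    """Interpret the low `width` bits of value as a two's complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def cr_field(result, width=64):
    """Hypothetical model of the CR field produced when Rc=1 (SO omitted)."""
    s = to_signed(result, width)
    return {"lt": s < 0, "gt": s > 0, "eq": s == 0}


def ffirst_on_cr(results, bit, want=True, width=64):
    """Return the truncated vector length: the index of the first element
    whose chosen CR bit does not match `want`."""
    for i, r in enumerate(results):
        if cr_field(r, width)[bit] != want:
            return i
    return len(results)
```

for example, ffirst_on_cr(results, "gt") truncates at the first result that is not greater than zero, and want=False inverts the sense of the test.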

however to implement all this is a pain in the neck, requiring
speculative execution (shadows) followed by post-analysis, altering
VL, and holding up execution whilst waiting for the new VL.

what if, then, a new mode was added that performed the exact same
testing, but pushed that test result *into the predication*?  i.e. the
test was ANDed with the predicate bit and thus stopped the result from
being written (or caused a zero to be written when zero-mode is enabled).
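
the idea can be sketched in python as below. this is only a model under assumptions: the function name, the bitmask encoding of the predicate (bit i covers element i) and the "zeroing" flag are all invented for illustration and are not the actual encoding.

```python
def cr_predicated_write(dest, results, predicate, test, zeroing=False):
    """Model of CR-result-driven predication (hypothetical helper).

    For each element the test outcome is ANDed with the predicate bit.
    If the combined bit is set the result is written. Otherwise the
    destination element is left alone, or set to zero when zeroing
    is requested.
    """
    out = list(dest)
    for i, r in enumerate(results):
        pred_bit = bool((predicate >> i) & 1)
        # each element's decision depends only on its own result and
        # its own predicate bit, never on any other element
        if pred_bit and test(r):
            out[i] = r
        elif zeroing:
            out[i] = 0
    return out
```

unlike the fail-first case there is no new vector length to work out: the loop here is only for modelling, and every iteration could run at the same time.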

the huge advantage of this is that, unlike ffirst, there is no hold-up
by way of post-analysis on the chain of results, to work out a new VL.
in this new mode the testing, the ANDing with predicate, is all
parallel and independent.

whilst zero/nonzero might not sound useful, some of the arithmetic
ones would almost certainly be.   imagine doing a signed ADD where
only results greater than zero were placed in the destination.
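
a worked version of that signed ADD, again as a hedged sketch: add_keep_positive is a made-up name, 64-bit two's complement wraparound is an assumption, and all elements are treated as predicated on.

```python
def add_keep_positive(dest, a, b, width=64):
    """Element-wise signed add where only sums greater than zero are
    written to the destination (hypothetical helper)."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    out = list(dest)
    for i, (x, y) in enumerate(zip(a, b)):
        total = (x + y) & mask                      # wrap to `width` bits
        signed = total - (1 << width) if total & sign else total
        if signed > 0:                              # the "gt" test
            out[i] = signed
    return out
```

with dest = [99, 99, 99, 99], a = [1, -5, 2, -3] and b = [2, 1, -2, 10] the sums are 3, -4, 0 and 7, so only elements 0 and 3 are overwritten.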

the hilarious thing is that all the pieces are already in place: CRs
and predication. it is the mixing of these two that makes for some
powerful savings.

