[Libre-soc-dev] SVP64 bclrl
luke.leighton at gmail.com
Thu Apr 7 12:14:11 BST 2022
On Thu, Apr 7, 2022 at 5:15 AM Jacob Lifshay <programmerjake at gmail.com> wrote:
> we need a SIMD branch unit...
a Micro-Architect *may* decide, just as with the Dynamic Partitioned
SIMD back-end ALUs (which receive multiple elements and therefore
multiple predicate bits), to create one or more SIMD Branch Units
at the back-end as well.
these would receive - and process - *multiple* CR Fields with *multiple*
predicate bits in a single unit.
these SIMD-Branch-Units would, on receiving the "latest copy of CTR",
perform a count-leading-ones/zeros on the CR Field bits referenced
by BO, mask them out based on the multiple predicate bits received,
and compute - in an entirely deterministic fashion - exactly how far
CTR should be decremented, to cover the [multiple] elements it had
been requested to cover.
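
as a rough illustration of the paragraph above, here is a minimal python
sketch of how such a unit could work out the CTR decrement for one batch
of elements. everything in it is an assumption made for illustration and
is not taken from the SVP64 specification: the function name
ctr_decrement, the "want" parameter (the CR bit value that causes the
branch to be taken), the rule that only predicated-in elements decrement
CTR, and the omission of any CTR-reaches-zero test.

```python
def ctr_decrement(cr_bits, pred_bits, want=1):
    """Hypothetical model of one SIMD Branch Unit pass.

    cr_bits:   the tested CR Field bit for each element (0 or 1)
    pred_bits: the predicate bit for each element (1 = active)
    want:      CR bit value that makes the branch taken (assumed)

    Returns (decrement, first_taken) where first_taken is the index
    of the first element whose branch is taken, or None.
    """
    # combine predicate and condition: an element "hits" only if it is
    # predicated-in AND its CR bit matches the wanted value
    hits = [bool(p) and c == want for c, p in zip(cr_bits, pred_bits)]

    # equivalent of count-leading-zeros over the hit vector,
    # taken in element order
    first_taken = hits.index(True) if True in hits else None

    # elements covered: up to and including the first taken branch,
    # or the whole batch if no branch is taken
    covered = len(hits) if first_taken is None else first_taken + 1

    # assumption: one CTR decrement per predicated-in element covered
    decrement = sum(1 for p in pred_bits[:covered] if p)
    return decrement, first_taken


# usage: the unit is handed the "latest copy of CTR" and returns the new one
ctr = 10
dec, first = ctr_decrement([0, 0, 1, 1], [1, 0, 1, 1])
ctr -= dec
```

the point of the sketch is only that the result depends on nothing but
the CR bits, the predicate bits and the incoming CTR, so it can be
computed in one unit rather than as a chain of per-element branches.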
such an implementation would reduce pressure on the size of
the Transitive Shadow Matrix in the OoO Execution Engine, as well
as reduce the number of in-flight instructions and allow better resource
the degenerate case (simple single-issue) would be that such a
SIMD-i-fied Branch Unit fits precisely the description that you
gave. the SVP64 Branch Specification allows for much more
sophisticated Micro-Architectural implementation.