[Libre-soc-dev] [RFC] svp64 "source zeroing" makes no sense
Richard Wilbur
richard.wilbur at gmail.com
Sun Mar 21 22:30:32 GMT 2021
On Sun, Mar 21, 2021 at 2:22 PM Luke Kenneth Casson Leighton
<lkcl at lkcl.net> wrote:
>
> On Sun, Mar 21, 2021 at 8:11 PM Richard Wilbur <richard.wilbur at gmail.com>
> wrote:
>
> > The original post made no mention of the file or the language
>
>
> soorryyy :)
I forgive you and thank you for supplying the context.
> ok so does it make more sense, now?
It definitely makes more sense now. I still have some comments:
What is the maximum value of VL (or similarly, the maximum length of
the predication masks)? I have in mind an optimization that avoids
iterating through the 0 bits of the predication masks, using logic
that generates the {src|dst}step pointing to the next non-zero bit in
the mask, provided there is at least one bit set.
    if not src_zeroing:
        srcstep += skip_zeros(srcmask, srcstep, vl)
        if (srcstep == vl): break
    if not dest_zeroing:
        dststep += skip_zeros(dstmask, dststep, vl)
        if (dststep == vl): break
    if dest_zeroing and ((1<<dststep) & dstmask) == 0:
        result = 0
        Condition_Register = EQzero
    else:
        if src_zeroing and ((1<<srcstep) & srcmask) == 0:
            RA = 0
            RB = 0
        else:
            RA = get_register_RA
            RB = get_register_RB
        result, Condition_Register = calc_operation(RA, RB)
    store_result(result)
    if Rc=1: store_cr(Condition_Register)
1. This changes the loops that skip zeros in predication masks into
an update of {src|dst}step by a fast logic block represented by
skip_zeros().
2. Here I have compared 1<<dststep to dstmask instead of srcmask.
3. It seems likely that we would like to perform the operation with
src arguments 0 if the flag src_zeroing is set and the srcmask bit
for the current element is 0.
4. If there are un-enumerated side effects that we wish to reproduce
from calc_operation() (aside from [result, Condition_Register]) then
we will likely want to perform the operation even though we are
zeroing the destination.
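
For illustration, here is a minimal Python sketch of what skip_zeros()
might compute. It assumes the return value is the distance from the
current step to the next set bit of the mask, and that it returns
vl - step when no bit is set in [step, vl), so that the "step == vl"
check above ends the loop. The helper active_steps() and the per-element
"step += 1" are hypothetical and exist only to exercise the function;
real hardware would presumably use a find-first-set / priority-encoder
block rather than integer arithmetic.

```python
def skip_zeros(mask: int, step: int, vl: int) -> int:
    """Return how far to advance `step` to reach the next set bit of `mask`.

    Assumed semantics: only bits in [step, vl) are considered. If none of
    them is set, return vl - step so that step + result == vl.
    """
    # Discard bits at or above vl, then drop the bits below step.
    remaining = (mask & ((1 << vl) - 1)) >> step
    if remaining == 0:
        return vl - step
    # Isolate the lowest set bit; its index is the count of trailing zeros.
    return (remaining & -remaining).bit_length() - 1


def active_steps(mask: int, vl: int) -> list:
    """Hypothetical driver: list the element indices a non-zeroing loop visits."""
    steps = []
    step = 0
    while True:
        step += skip_zeros(mask, step, vl)
        if step == vl:
            break
        steps.append(step)
        step += 1  # assumed per-element increment, not shown in the pseudocode
    return steps


if __name__ == "__main__":
    print(active_steps(0b10100, 8))
```

With mask 0b10100 and vl = 8, the loop body would run only for elements
2 and 4, without stepping through the masked-out elements one at a time.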