[Libre-soc-dev] [RFC] svp64 "source zeroing" makes no sense

Richard Wilbur richard.wilbur at gmail.com
Sun Mar 21 23:16:03 GMT 2021


On Sun, Mar 21, 2021 at 4:40 PM Luke Kenneth Casson Leighton
<lkcl at lkcl.net> wrote:
>
> On Sunday, March 21, 2021, Richard Wilbur <richard.wilbur at gmail.com> wrote:
> > What is the maximum value of VL
>
> 64.
>
> >  (or similarly, the maximum length of
> > the predication masks)?
>
> integer.  64.
>
> > I have in mind an optimization that avoids
> > iterating through the 0 bits of the predication masks with logic that
> > will generate the {src|dst}step pointing to the next non-zero bit in
> > the mask provided there is at least one bit set.

The optimization is very simple.  How sparse do you expect these
predication masks to be?

>
> this will be possible as a choice for individual implementors where it
> makes sense based on gate count, performance and power consumption for
> their needs.
>
> it will be helpful to record such optimisations for when there is time to
> implement them.

I'm happy to do that.  It is simple enough that it sounded like an
easy win: if we expect predication masks to be fairly sparse, it saves
the cycles spent stepping over zero bits on every predicated
instruction.
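
A minimal Python sketch of the skip-ahead idea, assuming the mask is
held as a plain integer and VL is at most 64.  The names next_step and
active_steps are made up for illustration; real hardware would more
likely use a priority encoder (find-first-set) than a software loop.

```python
def next_step(mask, step, vl=64):
    """Index of the first set bit in mask at or after step, or None.

    Hypothetical helper: models jumping {src|dst}step straight to the
    next active element instead of testing one mask bit per cycle.
    """
    # Keep only bits in [step, vl): drop bits >= vl and bits < step.
    remaining = mask & ((1 << vl) - 1) & ~((1 << step) - 1)
    if remaining == 0:
        return None  # no set bit left in the mask
    # remaining & -remaining isolates the lowest set bit.
    return (remaining & -remaining).bit_length() - 1


def active_steps(mask, vl=64):
    """Yield the index of every set bit in mask below vl, in order."""
    step = next_step(mask, 0, vl)
    while step is not None:
        yield step
        step = next_step(mask, step + 1, vl)
```

With a sparse mask such as 0b10010100 and VL=8, active_steps visits
only three elements rather than stepping through all eight positions.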

> right now the priority question is "does ORing the src and dest zeroing to
> put zeros in the output make sense"

For the code, I would suggest combining the masks first:

if dest_zeroing and src_zeroing:
    dstmask &= srcmask
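
A small sketch of what that does, assuming srcstep and dststep advance
in lockstep when both zeroing modes are enabled, so that bit i of each
mask refers to the same element.  The helper names combine_masks and
zeroed_elements are made up for illustration.

```python
def combine_masks(dstmask, srcmask, dest_zeroing, src_zeroing):
    """Fold the source mask into the destination mask when both
    zeroing modes are on; otherwise leave dstmask unchanged."""
    if dest_zeroing and src_zeroing:
        dstmask &= srcmask
    return dstmask


def zeroed_elements(mask, vl):
    """Element indices below vl whose mask bit is clear, i.e. the
    elements that dest zeroing would write as zero."""
    return [i for i in range(vl) if not (mask >> i) & 1]
```

For example, dstmask=0b1101 and srcmask=0b0111 combine to 0b0101, so
with VL=4 elements 1 and 3 get a zero result: a zero bit in either
mask is enough.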

>
> >        if dest_zeroing and ((1<<dststep) & dstmask) == 0:
> >             result = 0
> >             Condition_Register = EQzero
> >        else:
> >             if src_zeroing and ((1<<srcstep) & srcmask) == 0:
> >                 RA = 0
> >                 RB = 0
> >             else:
> >                 RA = get_register_RA
> >                 RB = get_register_RB
> >             result, Condition_Register = calc_operation(RA, RB)
>
>
> this is still the old behaviour which is passing zeros into the pipelines
> as input.
>
> this behaviour makes no sense and must be replaced.

I guess that's because I don't understand the intent.  To me, source
zeroing just passes 0's into whatever you were going to do.
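
To make that reading concrete, here is a runnable sketch of the quoted
pseudocode for a single element.  The register files are modelled as
Python lists, the operation is passed in as op, and the "NEzero" flag
value is a stand-in; all of these are assumptions for illustration,
not part of the spec.

```python
EQ_ZERO = "EQzero"
NE_ZERO = "NEzero"  # hypothetical stand-in for any non-zero CR result


def calc_operation(ra, rb, op):
    """Apply op and derive a simplified condition flag from the result."""
    result = op(ra, rb)
    return result, (EQ_ZERO if result == 0 else NE_ZERO)


def element_result(ra_regs, rb_regs, srcstep, dststep, srcmask, dstmask,
                   src_zeroing, dest_zeroing, op=lambda a, b: a + b):
    """One element of the 'old behaviour' from the quoted pseudocode."""
    if dest_zeroing and ((1 << dststep) & dstmask) == 0:
        # Destination masked out: zero goes straight to the output.
        return 0, EQ_ZERO
    if src_zeroing and ((1 << srcstep) & srcmask) == 0:
        # Source masked out: zeros are fed into the operation as inputs.
        ra, rb = 0, 0
    else:
        ra, rb = ra_regs[srcstep], rb_regs[srcstep]
    return calc_operation(ra, rb, op)
```

With an add, zeros in give zero out, so the two kinds of zeroing look
the same.  With an operation such as a 64-bit NOR, a masked-out source
yields all ones rather than zero, which is where feeding zeros into the
pipeline and putting zeros in the output stop being equivalent.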

> > 4.  If there are un-enumerated side effects that we wish to reproduce
> > from calc_operation()
>
>
> pipelined designs should not have such side effects because they require
> complex hazard detection to coordinate and it severely impacts
> opportunities for parallelism (performance)
>
> thus, logically, if there is a choice "compromising performance to maintain
> some arbitrary side-effect" the side-effect gets quashed with prejudice.

Good!


