[Libre-soc-dev] [RFC] svp64 "source zeroing" makes no sense

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sun Mar 21 22:40:01 GMT 2021


On Sunday, March 21, 2021, Richard Wilbur <richard.wilbur at gmail.com> wrote:

> On Sun, Mar 21, 2021 at 2:22 PM Luke Kenneth Casson Leighton
> <lkcl at lkcl.net> wrote:
> >
> > On Sun, Mar 21, 2021 at 8:11 PM Richard Wilbur <richard.wilbur at gmail.com>
> > wrote:
> >
> > > The original post made no mention of the file or the language
> >
> >
> > soorryyy :)
>
> I forgive you and thank you for supplying the context.
>
> > ok so does it make more sense, now?
>
> It definitely makes more sense now.  I still have some comments:
>
> What is the maximum value of VL


64.


>  (or similarly, the maximum length of
> the predication masks)?


integer.  64.


> I have in mind an optimization that avoids
> iterating through the 0 bits of the predication masks with logic that
> will generate the {src|dst}step pointing to the next non-zero bit in
> the mask provided there is at least one bit set.


this will be possible as a choice for individual implementors where it
makes sense based on gate count, performance and power consumption for
their needs.

it will be helpful to record such optimisations for when there is time to
implement them.
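
to record the idea, here is a minimal sketch of the "skip to the next set
bit" step, in python. the name next_step and its signature are made up
for illustration and are not from the spec or the simulator: it assumes
the mask is a plain integer with bit i corresponding to element i, and
that returning vl means "no more elements".

```python
def next_step(mask: int, step: int, vl: int) -> int:
    """Return the first index >= step whose mask bit is set, or vl if none.

    Hypothetical helper: models a find-first-set on the remaining mask bits.
    """
    # keep only the bits for elements 0..vl-1, then drop those below step
    remaining = (mask & ((1 << vl) - 1)) >> step
    if remaining == 0:
        return vl  # no set bits left: the loop is finished
    # (x & -x) isolates the lowest set bit; bit_length() - 1 is its index
    return step + (remaining & -remaining).bit_length() - 1


def active_steps(mask: int, vl: int):
    """Yield only the element indices whose mask bit is set."""
    step = next_step(mask, 0, vl)
    while step < vl:
        yield step
        step = next_step(mask, step + 1, vl)
```

in hardware this would be a priority encoder on the masked-off remainder
of the predicate, which is where the gate count / power trade-off for
each implementor comes in.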

right now the priority question is: "does ORing the src and dest zeroing to
put zeros in the output make sense?"

>        if dest_zeroing and ((1<<dststep) & dstmask) == 0:
>             result = 0
>             Condition_Register = EQzero
>        else:
>             if src_zeroing and ((1<<srcstep) & srcmask) == 0:
>                 RA = 0
>                 RB = 0
>             else:
>                 RA = get_register_RA
>                 RB = get_register_RB
>             result, Condition_Register = calc_operation(RA, RB)


this is still the old behaviour which is passing zeros into the pipelines
as input.

this behaviour makes no sense and must be replaced.
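
as a rough illustration only (one reading of the question above, not the
specification), the two behaviours can be put side by side in python.
old_behaviour, ored_behaviour and op are hypothetical names, and the
Condition_Register update is left out to keep it short.

```python
def old_behaviour(op, ra, rb, srcstep, dststep, srcmask, dstmask,
                  src_zeroing, dest_zeroing):
    """Quoted pseudocode: source zeroing feeds zeros INTO the operation."""
    if dest_zeroing and ((1 << dststep) & dstmask) == 0:
        return 0
    if src_zeroing and ((1 << srcstep) & srcmask) == 0:
        ra, rb = 0, 0  # zeros go in as inputs, op still runs
    return op(ra, rb)


def ored_behaviour(op, ra, rb, srcstep, dststep, srcmask, dstmask,
                   src_zeroing, dest_zeroing):
    """Assumed alternative: either zeroing condition puts 0 in the OUTPUT."""
    dst_masked = dest_zeroing and ((1 << dststep) & dstmask) == 0
    src_masked = src_zeroing and ((1 << srcstep) & srcmask) == 0
    if dst_masked or src_masked:
        return 0  # the operation is never issued
    return op(ra, rb)


def nor8(a, b):
    """8-bit NOR: an operation whose result for zero inputs is not zero."""
    return ~(a | b) & 0xFF


# source element 0 is masked out, destination element 0 is not
args = dict(ra=3, rb=5, srcstep=0, dststep=0, srcmask=0b10, dstmask=0b01,
            src_zeroing=True, dest_zeroing=True)
old_result = old_behaviour(nor8, **args)   # nor8(0, 0) == 0xFF
new_result = ored_behaviour(nor8, **args)  # 0
```

with an operation like NOR the old behaviour produces 0xFF for a masked-out
source element, where the ORed version produces 0 and never issues the
operation at all.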

> 4.  If there are un-enumerated side effects that we wish to reproduce
> from calc_operation()


pipelined designs should not have such side effects, because they require
complex hazard detection to coordinate, and that severely impacts
opportunities for parallelism (performance).

thus, logically, if maintaining some arbitrary side-effect means
compromising performance, the side-effect gets quashed with prejudice.




-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

