[Libre-soc-dev] scalar instructions and SVP64

Wed Mar 10 03:52:09 GMT 2021

On Wednesday, March 10, 2021, Jacob Lifshay <programmerjake at gmail.com>
wrote:

>
> https://libre-soc.org/irclog/%23libre-soc.2021-03-10.log.html#t2021-03-10T01:14:27
>
> I guess you can summarize what I envision for svp64 as: prefix for
> accessing stuff added with SV. one of the features is vector operation,
> another independent feature is accessing registers > r31, another one is
> predication, and so on.

yeah, i understand.  it explains the many ideas that you suggest, all of
which unfortunately break the underlying simplicity [which in turn makes
explaining SV harder *and* makes implementation harder in simulators,
compilers and HDL]

every idea that you come up with that is based on this misapprehension of
the fundamental principle creates an interaction between the prefix and its
suffix that makes comprehension harder, makes implementation harder, and
has many other detrimental implications.

even just the idea "why not have scalar-in-SV-when-VL=0" means that
hardware can no longer rely on VL=0 to skip SVP64 operations, it has to do
a FULL decode in order to determine which registers are marked as scalar
before being able to determine if it can be skipped.

the rule is: when SV activates, it activates the for-loop.

that's it.

that's why it's called SimpleV, not ComplexV.

the complete lack of connection and interaction between the for-loop and
the v3.0B instruction *within* that for-loop is very, very deliberate.
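
[a minimal sketch of the for-loop rule, for illustration only.  the names
sv_loop, scalar_add and regs are hypothetical and not taken from any
Libre-SOC code, and it assumes every operand is marked as a vector, so
scalar/vector marking, predication and element-width overrides are all left
out.  the point it shows is that the loop knows nothing about the operation
inside it, and that VL=0 simply means the body never runs.]

```python
# illustrative sketch only: sv_loop, scalar_add and regs are hypothetical names.
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def scalar_add(regs, rt, ra, rb):
    """stand-in for an unmodified scalar add: RT = RA + RB, wrapping at 64 bits."""
    regs[rt] = (regs[ra] + regs[rb]) & MASK64

def sv_loop(regs, vl, op, rt, ra, rb):
    """the for-loop: step a sub-counter from 0 to VL-1 and issue the scalar
    operation with each register number offset by that sub-counter.
    assumption: all three operands are treated as vectors."""
    for step in range(vl):
        op(regs, rt + step, ra + step, rb + step)

regs = [0] * 128
regs[8:12] = [1, 2, 3, 4]
regs[16:20] = [10, 20, 30, 40]

sv_loop(regs, 4, scalar_add, 24, 8, 16)
result = regs[24:28]

# VL=0: the loop body never executes, so the register file is untouched
before = list(regs)
sv_loop(regs, 0, scalar_add, 24, 8, 16)
unchanged = (regs == before)
```

[sv_loop never inspects op: any scalar operation with the same calling
shape could be passed in without changing the loop.]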

remember Tom Forsyth's talk.  you have to think *everything* through,
together.

> using svp64 doesn't automatically enable everything always, so why should
> vector/scalar be any different?

i'm having difficulty parsing this within the context of "VL is an
abstracted independent for-loop around scalar instructions".

the prefix bits (modes) *augment* the scalar behaviour.  the ones i am not
too happy about are the ones that create hazards (mapreduce, ffirst).  but,
their benefit is clear.

the others (saturate, pred-result) augment the *individual* scalar
instructions, those modes get passed to MULTIPLE independent parallel
execution engines.
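
[a rough sketch of what "augmenting the individual scalar instruction" could
look like, again with hypothetical names (add_element, sv_add) and an assumed
unsigned 8-bit saturation chosen purely to keep the numbers small.  it is
not the actual SVP64 saturation semantics.  the mode flag is handed
unchanged to every element operation, and no element depends on any other,
which is what lets them be issued in parallel.]

```python
# illustrative sketch only: add_element and sv_add are hypothetical names,
# and unsigned 8-bit saturation is an assumption made for the example.

def add_element(a, b, saturate=False, bits=8):
    """one scalar add; the mode flag only changes how overflow is handled."""
    limit = (1 << bits) - 1
    total = a + b
    if saturate:
        return min(total, limit)   # clamp to the maximum value
    return total & limit           # default: wrap around

def sv_add(va, vb, vl, saturate=False):
    """the same mode is passed to each independent element operation."""
    return [add_element(va[i], vb[i], saturate) for i in range(vl)]

wrapped = sv_add([250, 1, 200], [10, 2, 100], 3)
clamped = sv_add([250, 1, 200], [10, 2, 100], 3, saturate=True)
```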

remember, it's a Sub Program Counter.  you don't go "if PC ==
arbitrarymagicconstant then instruction behaviour equals different"

ok for LD/ST in some architectures you do :)

l.

-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68