[Libre-soc-dev] sv.mv x: the instruction from hell

Sat Jun 4 08:50:36 BST 2022

On Sat, Jun 4, 2022 at 1:19 AM Jacob Lifshay <programmerjake at gmail.com> wrote:

> there's a pretty simple fix, make the *scalar* instruction limit itself to VL:
> idx = GPR(RA)
> GPR(RT) = idx < VL ? GPR(RB + idx) : 0

as it's a MV (2-operand) it'd be
   GPR(RT) = idx < VL ? GPR(RT + idx) : 0

which is the solution discussed a while back.  this still leaves the
cross-interference from actually modifying RT as a problem:
WAR and RAW Hazards are created *in between each scalar element*.

there is a strict rule at play here: SV's inviolate rule is that
the elements behave as if they were executed as actual scalar
instructions.  therefore, with each element reading an index
and the previous element potentially having modified that
*index*, every element has to wait for the one before it, and
the entire sequence basically grinds to a halt.
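
to make the hazard concrete, here is a minimal python sketch (not
the spec, and not anyone's implementation).  it assumes element i of
the vector form uses RA+i and RT+i with RB as a fixed base, and it
models "hazards not checked" as "read everything, then write", which
is only one possible outcome.  all function names are hypothetical.

```python
# Toy register-file model; assumptions are described in the text above.

def mv_x_scalar(gpr, rt, ra, rb, vl):
    """One scalar element: GPR(RT) = idx < VL ? GPR(RB + idx) : 0."""
    idx = gpr[ra]
    gpr[rt] = gpr[rb + idx] if idx < vl else 0


def sv_mv_x_in_order(gpr, rt, ra, rb, vl):
    """Apply the scalar operation element by element, in program order.

    Assumption: element i uses RA+i and RT+i, RB stays a fixed base.
    """
    for i in range(vl):
        mv_x_scalar(gpr, rt + i, ra + i, rb, vl)


def sv_mv_x_read_first(gpr, rt, ra, rb, vl):
    """Read every index and source value before writing any result.

    Assumption: this stands in for "hazards between elements are not
    checked"; real hardware could produce other results.
    """
    results = []
    for i in range(vl):
        idx = gpr[ra + i]
        results.append(gpr[rb + idx] if idx < vl else 0)
    for i, value in enumerate(results):
        gpr[rt + i] = value


VL = 4


def make_regs():
    gpr = [0] * 32
    gpr[8:12] = [3, 2, 1, 0]        # indices: reverse the vector
    gpr[16:20] = [10, 20, 30, 40]   # source data
    return gpr


# Destination overlaps the source (RT == RB), so later elements read
# registers that earlier elements have already overwritten.
in_order = make_regs()
sv_mv_x_in_order(in_order, rt=16, ra=8, rb=16, vl=VL)

read_first = make_regs()
sv_mv_x_read_first(read_first, rt=16, ra=8, rb=16, vl=VL)

print(in_order[16:20])    # [40, 30, 30, 40]
print(read_first[16:20])  # [40, 30, 20, 10]
```

the two results differ only because the destination overlaps the
source: honouring the scalar-order rule means each element has to
see the writes of the elements before it.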

>> by setting the rule that the Hazards are *NOT* to be observed,
>> during the usage of this type of remap, all of the problems go away.
>
>
> elaborate on what you mean by not to be observed...i don't understand what you mean.

https://www.thesaurus.com/browse/observed

examined followed heeded regarded noted inspected watched checked.

probably the best one is checked.  "not to be checked".

> going off what I do understand, i think it's a pretty bad idea because
> it takes a slow/expensive instruction and just makes it slower and more
> expensive, also you'd need an additional instruction to set the remap
> nearly every time.

it's always going to be awful, because of retrofitting to a scalar ISA.
VSX doesn't have this problem at all because the indices are in
one single VSX register.

the biggest advantage of the remap concept is that we do not
have to propose a scalar mv.x instruction.  given that a scalar
instruction is a prerequisite for being able to use mv.x in SVP64,
and given that as a scalar instruction it's a total nightmare where
i would expect the OPF ISA WG to fight it tooth and nail *and i
would agree with them*, any alternative is better because it can be,
how to put it best... "slipped under the carpet" if you know what
i mean, there.

additionally, the whole reason for having the mv.x is so as to
shuffle registers around so that they can be used elsewhere,
e.g. by arithmetic operations.

well... um... if they can be "shuffled" as inputs *to* the arithmetic
operations because the re-indexing applies to one (or more)
of the arithmetic operations' inputs *instead* of using mv.x,
you just saved an instruction.

i.e. the remap becomes in effect the *replacement* for the sv.mv.x
rather than an *additional* instruction.
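
a rough python sketch of that equivalence, under loud assumptions:
the remap is modelled as nothing more than a per-element index
lookup applied to one operand of an add, the register numbers are
arbitrary, and every name here is hypothetical.  it is not how
REMAP is actually specified.

```python
# Toy model: permute-then-add versus add with a re-indexed input.

def mv_x_then_add(gpr, rt, ra, rb, idx_base, tmp, vl):
    """Two steps: permute RB into a temporary, then add elementwise."""
    for i in range(vl):
        gpr[tmp + i] = gpr[rb + gpr[idx_base + i]]
    for i in range(vl):
        gpr[rt + i] = gpr[ra + i] + gpr[tmp + i]


def add_with_remap(gpr, rt, ra, rb, idx_base, vl):
    """One step: the index lookup is applied to the RB operand itself.

    Assumption: the remap only re-indexes which RB element is read.
    """
    for i in range(vl):
        gpr[rt + i] = gpr[ra + i] + gpr[rb + gpr[idx_base + i]]


def make_add_regs():
    gpr = [0] * 32
    gpr[8:12] = [3, 2, 1, 0]        # indices
    gpr[16:20] = [10, 20, 30, 40]   # operand to be shuffled (RB)
    gpr[20:24] = [1, 2, 3, 4]       # other operand (RA)
    return gpr


two_step = make_add_regs()
mv_x_then_add(two_step, rt=24, ra=20, rb=16, idx_base=8, tmp=28, vl=4)

one_step = make_add_regs()
add_with_remap(one_step, rt=24, ra=20, rb=16, idx_base=8, vl=4)

print(two_step[24:28])  # [41, 32, 23, 14]
print(one_step[24:28])  # [41, 32, 23, 14]
```

same result either way, but the second form needs neither the
temporary registers nor the separate move.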

it's amazing how this isn't anywhere near as straightforward as
it first seems: there's so much to take into consideration.

l.