[Libre-soc-dev] sv.mv x: the instruction from hell

lkcl luke.leighton at gmail.com
Sat Jun 4 09:44:16 BST 2022

* instead of LD-sequential followed by sv.mv.x, just do an index remap on the LD
* the element indices could overlap, i.e. an element could be overwritten by
  a previous element's indexed-mv.  a rule of "undefined behaviour" could be
  set, but it is no different from setting the rule "ignore hazards" on
  indexed-remap
* the big difference, however, is that a bad instruction (as scalar)
  is made even worse by that undefined behaviour rule.
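
to make the overlap hazard concrete, here is a minimal sketch in python.  it is not the actual mv.x or svremap semantics: it assumes a hypothetical flat register file and an indexed move that copies regs[src + idx[i]] into regs[dst + i], one element at a time.  the function names are made up for illustration.

```python
# Hypothetical model (an assumption, not the real instruction semantics):
# an indexed move copies regs[src + idx[i]] into regs[dst + i] for each i.

def indexed_mv_in_place(regs, dst, src, idx):
    """Element-by-element: earlier writes are visible to later reads."""
    out = list(regs)
    for i, offset in enumerate(idx):
        out[dst + i] = out[src + offset]
    return out

def indexed_mv_snapshot(regs, dst, src, idx):
    """All source elements are read before any destination is written."""
    sources = [regs[src + offset] for offset in idx]
    out = list(regs)
    for i, value in enumerate(sources):
        out[dst + i] = value
    return out

regs = [10, 11, 12, 13]
reverse = [3, 2, 1, 0]          # reverse the vector onto itself

in_place = indexed_mv_in_place(regs, 0, 0, reverse)
snapshot = indexed_mv_snapshot(regs, 0, 0, reverse)

print(in_place)   # elements 2 and 3 read already-overwritten values
print(snapshot)   # the reversal a programmer would probably expect
```

the two results differ as soon as source and destination ranges overlap, which is exactly the case that would need an "undefined behaviour" or "ignore hazards" rule.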

to add mv.x, the proposal would have to read:

    "please accept this 32 bit scalar instruction that nobody in their
      right mind would ever use because it creates catastrophic
      read/write hazards.  oh and even when vectorised we have
      to propose undefined behaviour"

versus, for index remap:


     "please accept this slightly less optimal solution which can
       save register resources, hides the undefined behaviour
       behind an index-remap instruction, and doesn't need a
       32-bit instruction from hell (mv.x) to accompany it"

the latter is clearly a much cleaner proposal.




    sv.fmv (or any other mv instruction including converters)


     svremap.indexed RA


     svremap.indexed RB
     svremap.indexed RS


      svremap.indexed RA
      sv.addi RT.v, RA.v, 5

the use of *double* indexing is, err, where it gets fun/hilarious/obtuse.  it's even technically possible to do this:

      svremap.indexed RA
      svremap.indexed RB
      svremap.indexed RT
      sv.add RT.v, RA.v, RB.v
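
a rough python model of what remapping all three operands of a vector add could look like.  this is a sketch under assumptions: it treats an indexed remap as replacing the element offset i with idx[i] for that operand, and the register numbers, index arrays, and the function name remapped_add are all hypothetical.

```python
# Assumption: an indexed remap swaps the element offset i for idx[i] on the
# operand it is applied to.  Elements are processed in order, hazards ignored.

def remapped_add(regs, rt, ra, rb, idx_t, idx_a, idx_b):
    """Vector add where RT, RA and RB each have their own index array."""
    out = list(regs)
    for t, a, b in zip(idx_t, idx_a, idx_b):
        out[rt + t] = out[ra + a] + out[rb + b]
    return out

# Hypothetical layout: RA vector at 0..3, RB vector at 4..7, RT vector at 8..11.
regs = [1, 2, 3, 4,  10, 20, 30, 40,  0, 0, 0, 0]

result = remapped_add(
    regs, rt=8, ra=0, rb=4,
    idx_t=[1, 0, 3, 2],   # destination elements written in swapped pairs
    idx_a=[3, 2, 1, 0],   # RA read in reverse
    idx_b=[0, 0, 1, 1],   # RB elements each read twice
)
print(result[8:12])
```

here the sources and destination do not overlap, so element order does not matter; with overlapping ranges the same hazard question as for an indexed mv comes back.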

remap.indexed becomes a "get out of jail free" card for any type of operation more complex than the current hardware-for-loop remaps (matrix, DCT, FFT, which took 8 weeks to do).

need a triangular remap? no problem, pre-create the indices and use them as offsets with remap.indexed.
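
as an illustration of "pre-create the indices", a small python sketch that builds the offsets for the lower triangle of a row-major n x n matrix.  the row-major layout and the helper name lower_triangular_offsets are assumptions for the example; in practice the offsets would be stored in registers for the indexed remap to consume.

```python
def lower_triangular_offsets(n):
    """Flat offsets of the lower triangle (diagonal included), row-major."""
    return [row * n + col for row in range(n) for col in range(row + 1)]

offsets = lower_triangular_offsets(3)

# Applying the offsets to a flat 3x3 matrix picks out the triangle in order.
matrix = [1, 2, 3,
          4, 5, 6,
          7, 8, 9]
gathered = [matrix[o] for o in offsets]

print(offsets)
print(gathered)
```

any other irregular access pattern can be handled the same way: compute the offsets once in software, then let the indexed remap walk them.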

bottom line is, remap.indexed fits much better with the SV paradigm, because it abstracts out the *concept* of indexing.

oh.  i just realised.  we would also need to propose fmv.x, not just mv.x, and it would suffer the exact same flaws.  whereas remap.indexed can apply to *all* the GPR<->FPR interchangers.

there's no way i would be comfortable proposing a faulty, unusable suite of scalar fmv.x and associated GPR-FPR{.x} instructions.

does that help explain?


