[Libre-soc-bugs] [Bug 550] binutils support needed for svp64

Mon Dec 21 20:09:51 GMT 2020

https://bugs.libre-soc.org/show_bug.cgi?id=550

--- Comment #2 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Alexandre Oliva from comment #1)
> I'm a little puzzled (not just because I can hardly make head from tail of
> the svp64 web page :-)
> 
> why bother with "svp64 0x..." syntax, if we already have .long?

yes jacob pointed that out... although... an "svp64 0xNNNNNNN" instruction
would help you to understand the "first phase": where the RM field fits.

> 
> as for making sense of the page.  I guess it must all make some sense if you
> have some vague notion of what the prefixes are supposed to accomplish, but
> that's not me.

SV - aka SimpleV - is a hardware for-loop around instructions.

that's it.  full stop.

here is some pseudocode that shows what that looks like, using ADD as an
example:

https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=simple_v_extension/simple_v_chennai_2018.tex;hb=HEAD#l190
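
a minimal python sketch of that for-loop, for those who would rather not open the link. it is not the pseudocode from the slides: the register file is modelled as a plain list, `sv_add` is a hypothetical name, and all three operands are assumed to be vectors (so every register number steps by one per element). no predication, no element-width override.

```python
# hypothetical model: the integer register file is just a list of ints
MASK64 = 0xFFFFFFFFFFFFFFFF

def sv_add(regs, rd, ra, rb, vl):
    """Apply a scalar add as a loop of vl element operations.

    Assumption: rd, ra and rb are all vectors, so each register
    number advances by one per loop step.
    """
    for i in range(vl):
        # one ordinary 64-bit scalar add per element
        regs[rd + i] = (regs[ra + i] + regs[rb + i]) & MASK64
    return regs
```

with `vl=1` this degenerates to a single scalar add, which is the point: the instruction does not change, only the loop around it.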

>  I could use some examples, or pointers to earlier, more
> complete and self-contained docs that would give me some sense of what's
> supposed to be going on there.

this paragraph puts the above one-liner and the pseudocode into context:

https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=simple_v_extension/specification.mdwn;hb=HEAD#l38

> not that I really need to be able to make sense of it before I can implement
> binutils changes, mind you; it just helps avoid silly mistakes, and wrong
> assumptions, and I figured I might be able to help validate the proposed
> design, if only I had the required background. 

appreciated.

> alas, I suppose I'm missing
> background on GPUs, ppc 3.1 opcodes, and the earlier simd design for risc-v

there was no SIMD ISA: SV is *categorically* and very specifically
diametrically opposed to SIMD.

SIMD is considered harmful:
https://www.sigarch.org/simd-instructions-considered-harmful/

x86 expanded from 70 to *1400* instructions since 1978, thanks to SIMD (far,
far more since adding AVX512).  SIMD is an O(N^6) opcode proliferation
nightmare.

also we are not adding v3.1B opcodes (that is a separate discussion which
requires OPF permission). the sole reason for using EXT01 is to fit in with
v3.1B 64 bit prefixing in a nondisruptive fashion that the OPF ISA WG should
not have any objection to.

the sigarch article shows how RVV works.  SV is based on the exact same
underlying principle: you have an instruction, you have a vector loop on that
instruction, elements are computed based on that instruction.

full stop.

it's real simple.

VL in our case can be anywhere from 1 to 64.  *very rarely* it is permitted to
be zero.

so how do we set this "VL" or vector length?

well, with an instruction of course.
https://libre-soc.org/openpower/sv/setvl/

and... err... then what?  well, standard 32 bit scalar instructions do nothing
with it: they don't "understand" VL.

so we "Prefix" them.  this says, "hey you know that VL for-loop you want
applied? well the next 32 bits contains the instruction to be smashed into that
for-loop, oh and by the way here's some other random trash to chuck at the
loop, such as predication, blah blah".
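
a rough sketch of how a decoder (or a disassembler) could pair a prefix with the 32 bits that follow it. the only facts relied on are that the primary opcode is the top 6 bits of a word and that EXT01 means primary opcode 1. `pair_prefixes` is a hypothetical name, the other 26 bits of the prefix are treated as opaque, and nothing here tells an SVP64 prefix apart from a v3.1B one.

```python
EXT01 = 1  # primary opcode shared by 64-bit prefixes

def major_opcode(word):
    # primary opcode is the top 6 bits of a 32-bit instruction word
    return (word >> 26) & 0x3F

def pair_prefixes(words):
    """Group 32-bit words into (prefix, instruction) pairs.

    prefix is None for a plain, unprefixed scalar instruction.
    """
    out = []
    it = iter(words)
    for w in it:
        if major_opcode(w) == EXT01:
            insn = next(it, None)  # the next 32 bits are the instruction
            if insn is None:
                raise ValueError("stream ends on a prefix")
            out.append((w, insn))
        else:
            out.append((None, w))
    return out
```

for example, feeding it a plain `add r5, r5, r2` (0x7CA51214) followed by a prefixed copy of the same word gives one `(None, insn)` pair and one `(prefix, insn)` pair.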

therefore, ultimately, we want this kind of syntax:

    setvl r3, r5, VL=4
    SUBVL=2, ELWIDTH=8 { add r5, r5, r2 }

the output will be:

* 32 bits containing an instruction for setvl
* 32 bits starting with EXT01 as its Major Opcode and continuing with the
pattern that drops SUBVL=2 and ELWIDTH=8 somewhere into the RM field bits
* 32 bits containing an add instruction

this will get us that hardware for-loop activated 4 times (0-3) on that add
instruction.

actually 8 because SUBVL=2

and, actually, it will be 8bit adds not 64bit adds because ELWIDTH=8.
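
the arithmetic above, as a small python sketch. assumptions: elements are plain list entries (how they pack into the 64-bit registers is not modelled), both operands are treated as vectors, and the function names are hypothetical.

```python
def element_ops(vl, subvl):
    # each of the vl loop steps covers subvl sub-elements
    return vl * subvl

def add_at_width(a, b, elwidth):
    # add two elements and wrap the result to elwidth bits
    return (a + b) & ((1 << elwidth) - 1)

def sv_add_elements(dst, src, vl, subvl, elwidth):
    """Elementwise dst[i] + src[i] over vl*subvl elements."""
    n = element_ops(vl, subvl)
    return [add_at_width(dst[i], src[i], elwidth) for i in range(n)]
```

with VL=4 and SUBVL=2, `element_ops` gives 8, and with ELWIDTH=8 each of those adds wraps at 256 rather than at 2^64.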

does that provide you with a quick crash-course in how SV works?
