[Libre-soc-dev] [RFC] Vector/SIMD ISA Context Abstraction

Sat Jul 31 12:50:59 BST 2021

(taking llvm-dev off this one)

On 7/31/21, Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:

> there are also two modes of operation:
>
> * Vertical First which requires explicit incrementing of the Vector
> Element offset (effectively turning the register file into an
> indexable SRAM)

Alexandre, this may help considerably in actually understanding what
SVP64 is about (for gcc) and may be much easier to implement,
initially.

the src/dst element offset can be tied directly to a
sequentially-incrementing loop variable in loop autovectorisation.
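
as a rough illustration (a hypothetical model in plain C, not actual SVP64
or gcc code): the register file is treated as an indexable array, and the
loop variable is used directly as the element offset. the names regs,
NREGS, body_at_offset and vertical_first are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not SVP64 itself: the register file is a plain
 * array, and "step" plays the role of the src/dst element offset. */
#define NREGS 128

/* Loop body "c = a * b + c" written as two scalar instructions.
 * ra, rb, rc are base register numbers; step is the element offset. */
static void body_at_offset(long regs[NREGS], size_t ra, size_t rb,
                           size_t rc, size_t step)
{
    long tmp = regs[ra + step] * regs[rb + step]; /* instruction 1: mul */
    regs[rc + step] = tmp + regs[rc + step];      /* instruction 2: add */
}

/* Vertical-First sketch: the whole body runs for one element, then the
 * offset is incremented explicitly, in lock-step with the loop variable. */
void vertical_first(long regs[NREGS], size_t ra, size_t rb, size_t rc,
                    size_t n)
{
    assert(ra + n <= NREGS && rb + n <= NREGS && rc + n <= NREGS);
    for (size_t i = 0; i < n; i++)
        body_at_offset(regs, ra, rb, rc, i);
}
```

the point of the sketch is only that the element offset and the loop
counter are the same variable.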

the next phase in understanding (and the next advancement in
autovectorisation) is to allow several of those elements to be
executed in parallel, with the number chosen by the hardware and
communicated back to the loop increment at runtime.

basically, instead of

    for (i=0; i<n; i++)

it would be

    for (i=0; i<n; i+=SVSTATE.srcstep)

where the autovectoriser would recognise that elements
i..i+srcstep-1 are executed in parallel.
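
a minimal sketch of that loop shape in plain C, under stated assumptions:
the hardware is stood in for by a callback (hw_batch, hypothetical) that
reports how many elements it will take this time round, and the inner
loop stands in for what real hardware would run in parallel. none of
these names come from SVP64 or gcc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the hardware: reports how many elements it
 * will execute in parallel this time round, never more than remain. */
typedef size_t (*batch_fn)(size_t remaining);

/* Example policy (an assumption for illustration): up to 4 lanes. */
size_t four_lanes(size_t remaining)
{
    return remaining < 4 ? remaining : 4;
}

/* dst[i] = a[i] + b[i], advancing by whatever the "hardware" reports.
 * Returns the number of batches (outer loop iterations) taken. */
size_t add_in_batches(long *dst, const long *a, const long *b, size_t n,
                      batch_fn hw_batch)
{
    size_t batches = 0;
    for (size_t i = 0; i < n; ) {
        size_t step = hw_batch(n - i);
        assert(step >= 1 && step <= n - i);
        /* elements i..i+step-1 are independent of each other, so real
         * hardware would be free to run them in parallel */
        for (size_t j = i; j < i + step; j++)
            dst[j] = a[j] + b[j];
        i += step;   /* plays the role of i += SVSTATE.srcstep */
        batches++;
    }
    return batches;
}
```

with n=10 and the four_lanes policy the loop takes batches of 4, 4 and 2
elements.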

and the step after that, both in understanding and in autovectorisation,
is the case where the amount chosen by hardware to be executed in
parallel is equal to VL: each instruction then executes ALL elements of
the Vector before moving on to the next instruction. this is the mode
that Cray and RVV implement.
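
for contrast, a small sketch of that ordering in plain C (again a
hypothetical model: VL is a made-up constant here, and tmp stands in for
an intermediate vector register). each "instruction" is a complete pass
over all VL elements before the next instruction starts.

```c
#include <stddef.h>

#define VL 4  /* hypothetical vector length for this sketch */

/* "c = a * b + c" as two vector instructions. Each instruction covers
 * ALL VL elements before the next instruction begins. */
void horizontal_first(long c[VL], const long a[VL], const long b[VL])
{
    long tmp[VL];                     /* stand-in for a vector register */

    for (size_t i = 0; i < VL; i++)   /* instruction 1: mul, all elements */
        tmp[i] = a[i] * b[i];

    for (size_t i = 0; i < VL; i++)   /* instruction 2: add, all elements */
        c[i] = tmp[i] + c[i];
}
```

for element-wise operations like these, the results are the same as
running the whole body one element at a time; only the order of
execution differs.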

l.