[Libre-soc-dev] [RFC] horizontal SVP64 vectors
whygee at f-cpu.org
Thu Jul 8 13:22:32 BST 2021
Hi Luke,
On 2021-07-08 13:18, Luke Kenneth Casson Leighton wrote:
> On 7/8/21, Richard Wilbur <richard.wilbur at gmail.com> wrote:
>
>> So by “Horizontal Vectorisation” are you referring to running a list of
>> instructions on particular vector elements (the inside of the innermost
>> loop in Cooley-Tukey for example) then moving to the next vector elements
>> (possibly determined by a REMAP and some SHAPE registers) and repeating?
>
> yes, exactly.
>
> more later,
>
> first, jacob, i thought overnight about what you said, and basically
> for elwidth overrides the backend gets hit by a stack of 8 bit element
> 0 operations then a batch of el1 then el2 and yes, to sort that out
> buffering is needed.
>
> however that's an implementor's problem not an API problem, that
> allows different companies to compete on performance.
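To make the ordering in the quoted exchange concrete, here is a minimal sketch of the two issue orders. It is only an illustration: the function names are made up for this example, the instruction list is a toy, and REMAP/SHAPE and elwidth overrides are not modelled at all.

```python
def instruction_major_order(instrs, vl):
    """Each instruction runs over every element before the next one starts."""
    return [(op, el) for op in instrs for el in range(vl)]


def element_major_order(instrs, vl):
    """The whole instruction list runs on one element, then moves to the
    next element: the ordering Richard describes above."""
    return [(op, el) for el in range(vl) for op in instrs]


# Toy instruction list (hypothetical opcodes), vector length 2.
body = ["mul", "add"]
print(instruction_major_order(body, 2))
print(element_major_order(body, 2))
```

In the element-major order the backend sees every operation for element 0, then every operation for element 1, and so on, which is the batching Luke mentions for elwidth overrides.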
It's funny because it is another way of looking at
https://hackaday.io/project/8774-f-cpu/log/187267-f-cpu-as-a-decent-vector-processor
In my case, I simply "stick" the instruction in each of the 4 separate
pipelines for the duration of a hardware loop, so each pipeline keeps
executing the same opcode, separately and in parallel, BUT by writing to
destination registers in another pipeline, I let them communicate and
stream data from one pipeline to another.
No weird exception handling, no crazy scheduling involved.
I just need to define a prefix instruction that will manage the loop count
and pointer auto-updates. The other cool thing is that the vectors are
mapped to registers through the register-mapped memory: the vectors can be
ANY length and reside in cache, instead of requiring crazy numbers of
registers...
Of course the limitation is that the "vector operations" are restricted to
4 arithmetic operations at once (in series or in parallel), but the memory
side would be pretty efficient.
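A rough behavioural sketch of the scheme described above, assuming a lot: every name here (`run_stuck_loop`, the register names "A", "B", "T", "D") is hypothetical and is not taken from the F-CPU instruction set. It models only the data flow: up to 4 fixed operations, inputs fetched through auto-incrementing pointers, and one operation feeding the next. Pipeline timing and real parallelism are not modelled; the lanes are simply evaluated in order within each iteration.

```python
import operator


def run_stuck_loop(mem, src_a, src_b, dst, count, lanes):
    """Repeat the same (at most 4) operations `count` times over memory.

    lanes: list of (op, in1, in2, out) tuples; each op stays "stuck" for
    the whole loop. "A" and "B" stand in for register-mapped memory reads,
    "D" for the register-mapped memory write (all names are assumptions).
    """
    if len(lanes) > 4:
        raise ValueError("at most 4 stuck operations in this sketch")
    regs = {}
    for i in range(count):  # loop count handled by the "prefix"
        regs["A"] = mem[src_a + i]  # pointer auto-update: src_a + i
        regs["B"] = mem[src_b + i]  # pointer auto-update: src_b + i
        for op, in1, in2, out in lanes:
            # Writing `out` lets a later lane consume this lane's result.
            regs[out] = op(regs[in1], regs[in2])
        mem[dst + i] = regs["D"]
    return mem


# Two vectors of length 3 living in "memory", plus room for the result.
mem = [1, 2, 3, 10, 20, 30, 0, 0, 0]
lanes = [
    (operator.mul, "A", "B", "T"),  # lane 0: T = A * B
    (operator.add, "T", "A", "D"),  # lane 1: D = T + A, fed by lane 0
]
run_stuck_loop(mem, src_a=0, src_b=3, dst=6, count=3, lanes=lanes)
print(mem[6:9])  # [11, 42, 93]
```

The vector length is just `count`, so it is bounded by memory rather than by the size of the register file, which is the point made above about vectors living in cache.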
Remember: what would Seymour do? :-D
In fact I suspect that's close to how he did vector bypass in the Cray-1.
"just stick the instructions in place in the buffer" is a pretty simple
method.
> l.
yg