[Libre-soc-dev] [RFC] SVP64 Vertical-First Mode, batch processing
lkcl
luke.leighton at gmail.com
Fri Aug 13 16:14:43 BST 2021
On August 12, 2021 10:14:48 PM UTC, lkcl <luke.leighton at gmail.com> wrote:
>
>
>On August 12, 2021 9:37:16 PM UTC, Richard Wilbur
><richard.wilbur at gmail.com> wrote:
>>> i propose this change to:
>>>
>>> if HorizontalFirst
>>> if srcstep < VL
>>> srcstep increments
>>> else if VerticalFirst
>>> if srcstep < *MAXVL*
>>> srcstep increments
>>>
>>> questions, comments?
>>
>>Sounds like a good thing.
>
>my only concern is, should MVL be restricted to an immediate (for
>VFirst mode) or should it be allowed to be set via a register (RA).
>
>whilst the logic behind making MVL compile-time static for Horizontal
>Mode is obvious, i haven't got my head round Vertical Mode yet.
Horizontal-First, you perform these types of loops:
setmaxvli 8
loop:
setvl r5, r3 # VL=r5=MIN(MVL, r3)
sv.ld r20.v, r4(0) # load VL elements (max 8)
sv.addi r20.v, r20.v, 55 # add 55 to all vector
sv.st r20.v, r4(0) # store VL elements
add r4, r4, r5 # move r4 pointer forward
sub. r3, r3, r5 # decrement total count by VL
bnz loop
this will always do 8 elements at a time until r3 drops below 8.
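as a rough cross-check of the batch sizes, here is a minimal Python sketch of that loop. it is a model, not the hardware: the function name horizontal_first and its parameters are hypothetical, memory is a plain list indexed by element rather than by byte, VL is assumed to be min(MVL, r3), and the loop is skipped entirely for a zero count (the assembler as written is a do-while).

```python
def horizontal_first(mem, base, count, maxvl=8, imm=55):
    """Behavioural model of the Horizontal-First strip-mining loop.

    mem   -- list standing in for memory (element-indexed, an assumption)
    base  -- starting element index (plays the role of r4)
    count -- total number of elements to process (plays the role of r3)
    Returns the list of VL values used, one per pass of the loop.
    """
    batches = []
    ptr, remaining = base, count
    while remaining > 0:
        vl = min(maxvl, remaining)            # setvl r5, r3
        regs = mem[ptr:ptr + vl]              # sv.ld: load VL elements
        regs = [x + imm for x in regs]        # sv.addi: add 55 to each element
        mem[ptr:ptr + vl] = regs              # sv.st: store VL elements
        ptr += vl                             # add r4, r4, r5
        remaining -= vl                       # sub. r3, r3, r5
        batches.append(vl)
    return batches


if __name__ == "__main__":
    data = list(range(20))
    print(horizontal_first(data, 0, 20))
```

with 20 elements and MVL=8 the model does two full batches of 8 and then a final short batch, which is the "8 at a time until r3 drops below 8" behaviour.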
VerticalFirst, you insert a *second inner loop* with an svstep instruction just before the bnz. also, at the moment, rather than just setmaxvli 8 it is:
setmaxvvlandvfhint 8, 2 # MVL=8, VFHint=2
if the hardware *chooses* to set VFHint=2, then we will always have 2 elements at a time in the inner loop, until srcstep reaches VL:
setmaxvvlandvfhint 8, 2 # MVL=8, VFHint=2
loop:
setvl r5, r3 # VL=r5=MIN(MVL, r3)
loopinner:
sv.ld r20.v, r4(0) # load VLhint elements (max 2)
sv.addi r20.v, r20.v, 55 # add 55 to 2 elements
sv.st r20.v, r4(0) # store VLhint elements
svstep. # srcstep += VLhint
bnz loopinner # repeat until srcstep=VL
# now done VL elements, move to next batch
add r4, r4, r5 # move r4 pointer forward
sub. r3, r3, r5 # decrement total count by VL
bnz loop
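to make the nesting concrete, here is a minimal Python sketch of the two-level loop. several things in it are assumptions rather than anything settled in this thread: the name vertical_first and its parameters are hypothetical, the hardware is assumed to accept the hint as given, srcstep is assumed to offset the memory element as well as the register element, and the last inner chunk is assumed to be clipped so that srcstep never goes past VL.

```python
def vertical_first(mem, base, count, maxvl=8, vfhint=2, imm=55):
    """Behavioural model of the Vertical-First loop with an inner svstep loop.

    Returns a trace of (VL, elements_done_this_inner_pass) tuples.
    """
    trace = []
    ptr, remaining = base, count
    while remaining > 0:
        vl = min(maxvl, remaining)             # setvl r5, r3
        srcstep = 0
        while srcstep < vl:                    # loopinner
            # assumption: the final chunk is clipped to what is left of VL
            n = min(vfhint, vl - srcstep)
            lo = ptr + srcstep
            for i in range(lo, lo + n):
                mem[i] += imm                  # sv.ld / sv.addi / sv.st on n elements
            srcstep += n                       # svstep.
            trace.append((vl, n))
        # VL elements are now done, move on to the next batch
        ptr += vl                              # add r4, r4, r5
        remaining -= vl                        # sub. r3, r3, r5
    return trace


if __name__ == "__main__":
    data = list(range(11))
    for vl, n in vertical_first(data, 0, 11):
        print(f"VL={vl} inner chunk={n}")
```

in this model the outer loop still walks memory in batches of VL (bounded by MVL), whilst the inner loop only decides how many elements each pass over the three instructions covers.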
the question is, then: can we get rid of the inner loop? and if we do can anything useful be done?
i have a feeling, looking at this assembler, that VLhint genuinely serves a different purpose *in addition* to VL and MAXVL.
(btw aside: svstep+bnz was why i wanted a step-and-test branch conditional instruction but it's too CISC)
l.