[Libre-soc-bugs] [Bug 1229] fosdem2024 llvm simple-v

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Sun Dec 3 09:32:33 GMT 2023


https://bugs.libre-soc.org/show_bug.cgi?id=1229

--- Comment #11 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #7)
> (In reply to Luke Kenneth Casson Leighton from comment #4)
> > (In reply to Jacob Lifshay from comment #1)
> > > some points:
> > > the register keyword won't be going away completely
> > 
> > whew
> > 
> > >
> > > register long a asm("r3");
> > > asm("addi r3, r3, 1" : "+r"(a));  // increments a
> > 
> > what i envision is that this:
> >    register long a[10] asm ("*r3")
> 
> I think we should focus on not requiring specifying exact registers, since
> imo that's like half the benefit of compilers -- you don't have to figure
> out where everything goes, since that's really hard for humans. e.g. it took
> me *hours/days* to figure out where all the registers should go for divmod,
> and that's a short function!
> 
> > 
> > literally translates to
> >    setvl maxvl=10
> 
> I think we should avoid requiring the programmer to specify where all the
> setvls go, instead we just rely on vector types to track that information,
> and the compiler can then insert setvl instructions where necessary.

yes.  i missed out several compiler and assembly-level peephole optimisation
passes there for simplicity (at 5am)

> additionally, compiler optimizations tend to work much better when it
> doesn't have to worry about a bunch of global state (VL), and can just
> reintroduce (insert setvl ops) the global state at the end after doing the
> optimizations.

... which is where the correct design of the IR-prefix-representing-SV
comes into play, but ultimately it should define pretty much exactly the
current SVP64 SPRs.
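
as a rough sketch of the "reintroduce VL at the end" idea quoted above: the
toy pass below walks straight-line ops that each carry their own vector
length and marks where a setvl would have to go.  the struct, the function
name and the flat-array "IR" are all made up for illustration and are not
taken from LLVM or from any existing SVP64 backend; control flow is ignored
entirely.

```c
#include <stddef.h>

/* Hypothetical toy IR: each op records the vector length it needs.
   A vl of 0 marks a scalar op, which does not care about VL. */
struct vec_op {
    const char *name; /* mnemonic, for illustration only */
    unsigned vl;      /* required vector length, 0 = scalar */
};

/* Late pass over straight-line code: set needs_setvl[i] to 1 when a
   setvl must be emitted before ops[i], i.e. only where the required VL
   differs from the VL currently in effect.  Returns the number of
   setvl instructions that would be inserted. */
size_t place_setvl(const struct vec_op *ops, size_t n,
                   unsigned char *needs_setvl)
{
    unsigned current_vl = 0; /* 0 = VL unknown on entry */
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        needs_setvl[i] = 0;
        if (ops[i].vl == 0)
            continue; /* scalar op: leaves VL untouched */
        if (ops[i].vl != current_vl) {
            needs_setvl[i] = 1;
            current_vl = ops[i].vl;
            count++;
        }
    }
    return count;
}
```

because every op carries its own length, earlier passes can reorder or
delete ops without tracking VL at all; only this last walk has to care.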

> 
> > 
> > and marks r3 as vector, such that sv.addi works as expected.
> > following on from that...
> 
> I think it should just mark `a` as a vector, 

yes, sorry, wasn't clear, yes absolutely.

> not r3, since a has the type
> `register long[10]` which tells the compiler that it's a vector with
> MAXVL=10, 

you got it in 1.

> so the compiler can assign it to any convenient spot or even
> optimize it out completely rather than being forced to keep it in r3 and key
> all scalar/vector decisions off of whether r3 is mentioned or not.

yep.
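
a minimal plain-C illustration of "the type carries MAXVL": the element
count comes from the array type itself, not from any named register.  the
macro and function names are invented for this sketch, and the loop is only
a stand-in for what a vectorising compiler might turn into a single sv.addi.

```c
#include <stddef.h>

/* Number of elements in an array object, derived purely from its type. */
#define MAXVL_OF(v) (sizeof(v) / sizeof((v)[0]))

/* Adds imm to every element of a 10-element array.  The pointer-to-array
   parameter keeps the length in the type, so the loop bound needs no
   separate length argument. */
static void add_imm10(long (*a)[10], long imm)
{
    for (size_t i = 0; i < MAXVL_OF(*a); i++)
        (*a)[i] += imm;
}
```

calling add_imm10(&a, 1) on a `long a[10]` increments all ten elements;
nothing in the source says where `a` lives.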

then any loops can easily be autovectorized.  vertical-first is going to
be astonishingly, laughably simple to implement.  Horizontal-First (HF) is a
little harder but doable with the right IR passes and checking that element
variables within the loop are all 100% independent.
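
to make "100% independent" concrete, here are two plain-C loops (function
names made up for this sketch).  in the first, each output element depends
only on the same-index inputs, so the elements could be processed in any
order or all at once.  the second has a loop-carried dependency and cannot
be treated that way without further transformation.

```c
#include <stddef.h>

/* Independent: dst[i] depends only on a[i] and b[i].  restrict promises
   that dst does not alias the inputs, which a compiler would otherwise
   have to prove before reordering elements. */
void add_elems(long *restrict dst, const long *a, const long *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Not independent: iteration i reads x[i - 1], which iteration i - 1
   has just written (in-place running sum). */
void prefix_sum(long *x, size_t n)
{
    for (size_t i = 1; i < n; i++)
        x[i] += x[i - 1];
}
```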
