[Libre-soc-bugs] [Bug 251] Initial 3D MESA non-accelerated software-only driver is needed

Fri Jul 2 12:47:33 BST 2021

https://bugs.libre-soc.org/show_bug.cgi?id=251

--- Comment #61 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to vivekvpandya from comment #60)

> Again, maybe some low-hanging fruit like
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4385

interesting, a parallel addition.

> > that said if you've managed to get as far as you have, and opened up
> > a new development path with it, that's fantastic, and a good call.
> 
> This particular path can get us started on the LLVM side. I am thinking of
> modifying the LLVM PowerPC backend to run a simple vectorizer pass (just
> before GlobalISel) that creates libre-soc specific LLVM intrinsics (these
> can be added with TD), and then updating GlobalISel to generate libre-soc's
> textual assembly for the newly added LLVM intrinsics.

great idea.  if that can be done as stand-alone programs (not needing
the entirety of mesa) that would be particularly good, or, more to the
point, if the *assembly* can be generated stand-alone that's really
good.

the generated assembly can then be run through the python-based simulator,
which can do around 2,500 instructions per second on high-end hardware, and
that's perfect for running short programs.

in parallel with that, a c-based simulator can be written, which can do
100,000 instructions per second even on low-end hardware.
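
to put those two rates in perspective, here is a back-of-envelope sketch in
python. the rates are just the rough figures quoted above, and the program
lengths (and the name sim_seconds) are made up for illustration, not
measurements of either simulator.

```python
# back-of-envelope only: rates are the rough figures quoted in this comment
PY_SIM_RATE = 2_500     # instructions/second, python-based simulator
C_SIM_RATE = 100_000    # instructions/second, c-based simulator (estimate)


def sim_seconds(n_instructions, rate):
    """wall-clock seconds needed to simulate n_instructions at a given rate."""
    return n_instructions / rate


if __name__ == "__main__":
    # hypothetical program lengths, chosen only for illustration
    for n in (10_000, 1_000_000):
        print(n, sim_seconds(n, PY_SIM_RATE), sim_seconds(n, C_SIM_RATE))
```

so a ten-thousand-instruction test takes seconds on the python simulator,
whereas a million-instruction run is where the 40x difference starts to matter.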

btw a word of caution: if expanded out to a single 1D linear sequence,
there are over a QUARTER OF A MILLION, possibly even HALF A MILLION,
instructions in SVP64.

let the implications sink in for a minute.

LLVM Vector ISA support will have assumed that there are a limited number
of instructions, possibly as many as 1,000, maybe even 10,000 when all
intrinsics are permuted out, so it is "perfectly fine" to have programs
which auto-generate all possible IR combinations.

for SVP64 this approach would create multi-megabyte IR files.
this is down to the fact that there is a 32-bit opcode space *MULTIPLIED*
by a 24-bit prefix.

it would be much, much better if LLVM's IR was designed around the 2D
{prefix}{suffix} concept rather than the 1D {prefix * suffix} space.
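
a minimal python sketch of the difference, under loud assumptions: the
counts below are HYPOTHETICAL, picked only so that the product lands on the
"quarter of a million" figure, and the names (flat_1d, table_2d, size_2d)
are invented for illustration. none of this reflects real SVP64 encodings
or any LLVM data structure.

```python
from itertools import product

# HYPOTHETICAL counts: not real SVP64 numbers, just chosen so that
# N_PREFIX_VARIANTS * N_SUFFIX_OPS == 250,000
N_PREFIX_VARIANTS = 500
N_SUFFIX_OPS = 500

# upper bound on the raw encoding space: 24-bit prefix x 32-bit suffix
ENCODING_BITS = 24 + 32

prefixes = [f"pfx{i}" for i in range(N_PREFIX_VARIANTS)]
suffixes = [f"op{i}" for i in range(N_SUFFIX_OPS)]


def flat_1d(prefixes, suffixes):
    """1D {prefix * suffix}: one explicit entry per combination."""
    return [f"{p}.{s}" for p, s in product(prefixes, suffixes)]


def table_2d(prefixes, suffixes):
    """2D {prefix}{suffix}: describe each axis once, combine on demand."""
    return {"prefix": list(prefixes), "suffix": list(suffixes)}


def size_2d(table):
    """number of entries that have to be written down in the 2D form."""
    return len(table["prefix"]) + len(table["suffix"])


if __name__ == "__main__":
    print("1D entries:", len(flat_1d(prefixes, suffixes)))
    print("2D entries:", size_2d(table_2d(prefixes, suffixes)))
    print("raw encoding bits:", ENCODING_BITS)
```

the 1D form grows as the product of the two axes, the 2D form only as their
sum, which is the whole argument for keeping prefix and suffix separate.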

but, we work with what we've got.
