[Libre-soc-dev] 3D MESA Driver
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Aug 10 16:56:28 BST 2020
On Mon, Aug 10, 2020 at 4:28 PM Hendrik Boom <hendrik at topoi.pooq.com> wrote:
> I suppose it will still be possible to divide the screen into tiles, and
> have a separate C(G)PU do the graphics in each tile -- for further
exactly. just like in Larrabee, except with hardware-optimised
instructions [Larrabee was a software-exclusive experiment. it made a
great general-purpose "compute" engine, but was nowhere near a good
enough (i.e. competitive) 3D GPU]
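to make the tiling idea concrete, here is a minimal sketch in python. it is purely illustrative: the function names, the 16-pixel tile size and the round-robin assignment are assumptions made for the example, not anything taken from Larrabee or from Libre-SOC code.

```python
# hypothetical sketch: split a screen into tiles and hand them out to cores.

def make_tiles(width, height, tile=16):
    """Return (x0, y0, x1, y1) rectangles covering the screen.

    Tiles on the right and bottom edges are clipped to the screen size.
    """
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


def assign_round_robin(tiles, ncores):
    """Core i gets tiles i, i + ncores, i + 2 * ncores, and so on."""
    return [tiles[i::ncores] for i in range(ncores)]


tiles = make_tiles(40, 24, tile=16)        # a tiny 40x24 "screen"
work = assign_round_robin(tiles, ncores=4)  # one work list per core
```

each core then renders only the rectangles in its own list, so no two cores ever write the same pixel.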
> This would likely be accomplished entirely in software.
now, we *may* need a special area of memory for the tiles. this would
be a special, small, protected resource, perhaps with Z-Buffer
capability. we just have to see how it goes. if the performance
isn't good enough, and (following Jeff Bush's analysis techniques) it
turns out to be a high-reward target, then we provide a special tile
area and associated instructions.
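as a rough software model of what a small tile area "with Z-Buffer capability" means, the sketch below keeps a colour and a depth value per pixel and only accepts a fragment if it is nearer than what is already stored. the class name, the tile size and the "smaller z is nearer" convention are all assumptions for illustration. nothing here describes an actual hardware design.

```python
# hypothetical sketch: a per-tile colour buffer with a depth (Z) test.

class TileBuffer:
    def __init__(self, size=16, far=float("inf")):
        self.size = size
        self.colour = [0] * (size * size)   # one colour value per pixel
        self.depth = [far] * (size * size)  # everything starts "infinitely far"

    def plot(self, x, y, z, colour):
        """Store the fragment only if it is nearer than the current one.

        Returns True if the pixel was updated, False if it was hidden.
        """
        i = y * self.size + x
        if z < self.depth[i]:
            self.depth[i] = z
            self.colour[i] = colour
            return True
        return False


t = TileBuffer(size=4)
t.plot(1, 2, 5.0, 0xFF0000)   # first fragment is always visible
t.plot(1, 2, 9.0, 0x00FF00)   # further away: rejected by the depth test
```

the compare-and-conditionally-store in plot() is the kind of inner-loop operation that would be a candidate for dedicated instructions if profiling showed it to be worth it.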
> Would this involve expensive data transfer between CPU's, which we are
> trying to avoid by merging the CPU with the GPU?
a lot of the software *complexity* in "normal" 3D GPU drivers is
because that "driver singular" is not just one binary executable or
library, it is a dog's dinner mess involving:
* userspace application
* proprietary userspace library which contains
  * shader compiler and
  * communications and marshalling/unmarshalling "shim" library to
* kernelspace, which passes packed (shared memory) objects over to a
  separate GPU using PCIe or other method and
* GPU unpacks the data and the shader binary and
* executes it whilst the CPU waits for the results and
* CPU receives notification in kernelspace of completion and
* context-switches back to the userspace application which
* continues on its path.
.... anyone think this is sane? anyone?
normally, the tiling area would be part of the GPU: the CPU would
never, under any circumstances, get access to it - or even know it was
there. those tiles would be copied directly out to the framebuffer by
a DMA engine (or straight memcpy) done on the *GPU*.
however, with a hybrid CPU-GPU the tile copy is done using CPU
instructions (or CPU DMA) and CPU memory locking and, crucially, it's
done in *userspace*, as part of a *userspace* application, not
kernelspace.
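a minimal sketch of that copy-out step, written as an ordinary userspace function: the finished tile is copied into the framebuffer one row at a time, which is all a "straight memcpy" amounts to. the function name, the 4-bytes-per-pixel layout and the use of a plain bytearray as the framebuffer are assumptions for the example.

```python
# hypothetical sketch: copy a finished tile into a linear framebuffer.

def blit_tile(fb, fb_width, tile, tile_w, tile_h, x0, y0, bpp=4):
    """Copy a tile_w x tile_h tile into fb at pixel position (x0, y0).

    fb is a bytearray holding fb_width pixels per row, bpp bytes per pixel.
    The caller must make sure the tile fits inside the framebuffer.
    """
    row_bytes = tile_w * bpp
    for row in range(tile_h):
        src = row * row_bytes
        dst = ((y0 + row) * fb_width + x0) * bpp
        fb[dst:dst + row_bytes] = tile[src:src + row_bytes]  # one row "memcpy"


fb = bytearray(8 * 4 * 4)            # 8x4 pixel framebuffer, 4 bytes per pixel
tile = bytes([0xAA]) * (2 * 2 * 4)   # a 2x2 tile filled with 0xAA
blit_tile(fb, 8, tile, 2, 2, x0=3, y0=1)
```

there is no marshalling, no kernel round-trip and no second processor involved: it is just a loop over rows in the application's own address space.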