[Libre-soc-dev] 3D MESA Driver

Luke Kenneth Casson Leighton lkcl at lkcl.net
Mon Aug 10 17:29:19 BST 2020


On Mon, Aug 10, 2020 at 5:15 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> On Mon, Aug 10, 2020, 08:57 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> wrote:
>
> > normally, the tiling area would be part of the GPU: the CPU would
> > never, under any circumstances, get access to it - or even know it was
> > there.  those tiles would be copied directly out to the framebuffer by
> > a DMA engine (or straight memcpy) done on the *GPU*.
> >
>
> Actually, for a lot of GPUs, they render directly to the frame buffer,

ah, i think i was thinking of the Broadcom VideoCore IV which, if
memory serves correctly, has a tile (or z?) buffer SRAM, which it
*then* writes out (directly) to the framebuffer.

> then
> the video scan out hardware is told to page flip to the just rendered frame
> buffer.

(double buffering).  yes, Richard Herveille's RGBTTL HDL has a double
buffering capability, which we can use for this purpose.
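
to make the double-buffering idea concrete, here is a minimal sketch in
plain C.  it is not taken from the RGBTTL HDL or from any driver: the
struct, the function names and the tiny resolution are all hypothetical,
and the "flip" is just an index swap standing in for what would be a
register write in real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical, deliberately tiny resolution for illustration only */
enum { FB_W = 8, FB_H = 4 };

struct double_fb {
    uint32_t pixels[2][FB_W * FB_H]; /* two complete framebuffers */
    int front;                       /* index of the buffer being scanned out */
};

/* buffer the renderer may write to: the one NOT being scanned out */
uint32_t *fb_back(struct double_fb *fb)
{
    return fb->pixels[1 - fb->front];
}

/* buffer the scan-out side reads from */
const uint32_t *fb_front(const struct double_fb *fb)
{
    return fb->pixels[fb->front];
}

/* page flip: swap the roles of the two buffers (assumption: real
 * hardware would do this with a register write, not an index swap) */
void fb_flip(struct double_fb *fb)
{
    assert(fb->front == 0 || fb->front == 1);
    fb->front = 1 - fb->front;
}
```

the point is that rendering only ever touches fb_back(), so scan-out
never sees a half-drawn frame; it only sees the new frame after
fb_flip().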


> My current plan is to render to the in-L1 cached section of the frame
> buffer, so all that's needed to scan out is just to ensure the scan-out
> hardware's view of memory is up to date, either by cache flush or by using
> a coherent memory system which allows the scan-out hardware to read from
> the cache or provoke the cache to write to memory when the accessed
> addresses are cached in a modified state.

POWER9 has an lwsync opcode, so it is perfectly reasonable to do.

> >
> > however with a hybrid CPU-GPU it's done using CPU instructions, or CPU
> > DMA, and CPU memory locking, and, crucially, it's done in *userspace*
> > - as part of a *userspace* application - not kernelspace.
> >
>
> The kernel would be involved in setting up DMA if we go that route, which I
> don't think is necessary for memory-memory copy.

true... yes, if we used DMA, userspace would not get direct access
to it: normal UNIX OSes don't allow that.
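
for the non-DMA route (a straight memory-to-memory copy done with
ordinary CPU instructions in userspace), a rough sketch of copying one
tile into a linear framebuffer could look like the following.  the
tile size, the function name tile_to_fb and the 32-bit pixel format
are assumptions for illustration only, not anything decided for the
driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* hypothetical tile dimensions, in pixels */
enum { TILE_W = 4, TILE_H = 4 };

/* copy one TILE_W x TILE_H tile (stored row-major in `tile`) into a
 * linear framebuffer that is fb_width pixels wide, with the tile's
 * top-left corner landing at pixel (x0, y0) */
void tile_to_fb(uint32_t *fb, size_t fb_width, size_t fb_height,
                const uint32_t *tile, size_t x0, size_t y0)
{
    /* the tile must fit entirely inside the framebuffer */
    assert(x0 + TILE_W <= fb_width);
    assert(y0 + TILE_H <= fb_height);

    /* rows of a tile are contiguous, but consecutive rows are
     * fb_width pixels apart in the framebuffer, so copy row by row */
    for (size_t row = 0; row < TILE_H; row++) {
        memcpy(&fb[(y0 + row) * fb_width + x0],
               &tile[row * TILE_W],
               TILE_W * sizeof(uint32_t));
    }
}
```

nothing here needs the kernel: it is one memcpy per tile row, which is
why a memory-to-memory copy does not need DMA to be set up.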

> it would also schedule the page flip for the video scan out hardware using
> the standard DRM/KMS API. This should be fast enough as it involves writing
> a few registers for each page flip.

indeed.

l.


