[libre-riscv-dev] GPU design

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sun Dec 9 04:35:16 GMT 2018


On Sun, Dec 9, 2018 at 3:38 AM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> Luke,
>
> What do you think of me building a proof of concept register allocator for
> my idea for sharing the int and fp register files? I'd estimate that it
> would take about 2-4 days of work.

 sure... however please note below: modifying a hardware design and
expecting the compiler to "sort out the mess" is a sure-fire
guaranteed way to kill a project.  i've learned of *another* project
that did this:

 https://groups.google.com/d/msg/comp.arch/w5fUBkrcw-s/BMGz1HoUCAAJ

 others include the Aspex Semiconductors Array-String Processor, which
had one of the worst productivity rates i've ever encountered: DAYS
per line of ASSEMBLY code.  i worked for aspex as an FAE for six
months.


> If it works well, we could reuse the architecture in the actual llvm
> backend when we get around to writing that.

 that's exactly what concerns me.  without the compiler work being
done *as well* we have absolutely no sure-fire guaranteed way to know
if the idea will be successful... or not.

 consequently, it's extremely risky.  and, more than that, there are
alternative (non-risky) options that are much more "standard".

 also, we cannot just assume that llvm will be the only compiler.  we
need gcc as well.


> I'm assuming that register allocation is what you think will be most of the
> compiler problems that you're nervous about.

 it's a huge number of things:

* the shared workload, between VPU and GPU
* the needs of the GPU for using the integer file (now reduced in
size) for storing pixels
* that scalar RV hasn't done it
* that the compiler will need to generate "if else" blocks and/or
function calls on critical loop setup/teardown to dynamically cope
with runtime register allocation
* several other issues which i suspect will crop up as well
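
 to make the "if else" point above concrete, here is a minimal sketch
in python.  it is purely illustrative: the register count, the function
name and the three loop variants are all hypothetical, and it assumes
that the int/fp split of a merged file is only known at run time.  it
models the kind of check a compiler might have to emit at loop setup.

```python
# Hypothetical illustration only: none of these names or numbers come
# from the thread or from any real compiler.
SHARED_REGS = 32  # assumed size of a merged int/fp register file


def pick_loop_variant(int_regs_in_use, fp_needed_fast, fp_needed_slow):
    """Model the setup check a compiler might emit before a hot loop.

    Returns which pre-compiled loop body would be selected, given how
    many registers the integer side has already claimed at run time.
    """
    free = SHARED_REGS - int_regs_in_use
    if free >= fp_needed_fast:
        return "unrolled"  # enough registers for the fast body
    elif free >= fp_needed_slow:
        return "rolled"    # fall back to a body needing fewer registers
    else:
        return "spill"     # not enough left: spill to memory


if __name__ == "__main__":
    for used in (8, 24, 30):
        print(used, pick_loop_variant(used, 16, 4))
```

 every hot loop would carry a branch like this plus several versions of
its body, which is the code-size and complexity cost being described.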

honestly, i feel it's one of those nightmare areas that will take
several months to _retrospectively_ have worked out that it wasn't a
good idea.

and, with the exploration of the CDC 6600 with mitch alsup's help,
register renaming can be done by adding one-, two- or three-entry
register queues at the front of each functional unit.
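
one way to read the queue idea is sketched below in python.  this is
an assumption-laden toy, not the CDC 6600 design and not necessarily
what was discussed with mitch alsup: the class and method names are
hypothetical, operands are assumed to be ready at issue time, and
result forwarding is left out entirely.  the only point it shows is
that copying operand values into a small per-unit queue at issue lets
a later write to the same register proceed without disturbing the
pending operation.

```python
from collections import deque


class FunctionalUnit:
    """Toy functional unit with a small operand queue at its front.

    Hypothetical model: operand *values* are captured at issue time,
    so the source registers may be overwritten afterwards.
    """

    def __init__(self, op, depth=2):
        self.op = op          # the operation, e.g. lambda a, b: a + b
        self.depth = depth    # one, two or three entries
        self.queue = deque()  # pending (dest, value_a, value_b) entries

    def can_issue(self):
        return len(self.queue) < self.depth

    def issue(self, regfile, dest, src1, src2):
        """Capture operands now; return False (stall) if the queue is full."""
        if not self.can_issue():
            return False
        self.queue.append((dest, regfile[src1], regfile[src2]))
        return True

    def execute_one(self, regfile):
        """Retire the oldest queued operation, writing its result."""
        if self.queue:
            dest, a, b = self.queue.popleft()
            regfile[dest] = self.op(a, b)


if __name__ == "__main__":
    regs = {1: 10, 2: 20, 3: 0}
    adder = FunctionalUnit(lambda a, b: a + b, depth=2)
    adder.issue(regs, 3, 1, 2)  # captures 10 and 20
    regs[1] = 99                # later write to a source register
    adder.execute_one(regs)
    print(regs[3])              # uses the captured values, not 99
```

a real design would also have to track operands that are not yet
available, which this sketch deliberately ignores.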

basically, the need for a merged register file is moot, it's hugely
problematic, and it'll take a *long* time to work out *that* it was
problematic.

please please think about that before committing the time.  can you do
the hardware *and* the compiler (both gcc and llvm) in the same 2-4
days?  is it possible to reduce the feedback loop latency in *any*
way?

l.


