[Libre-soc-dev] Hardware-accelerated specialized instructions

Fri Dec 11 00:14:06 GMT 2020

On 12/11/20, Cole Poirier <colepoirier at gmail.com> wrote:

> Right. Which comes back to the core of why we are using nmigen, being
> able to change the quantity of each FU by simply changing one line of
> a Python dictionary instead of writing 1000s of lines of HDL every time
> we want to experiment with FU allocation. Right?

that simply increases or decreases the number of FUs.  it does nothing
to change the effectiveness of specialist hardware units compared to
general purpose CPU units.
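
for illustration only, a minimal sketch of the "one line of a python dictionary" idea from the quoted text. the FU names, the counts and the helper allocate_fus are all hypothetical and are not taken from the actual Libre-SOC source: the point is just that the quantity of each FU is a single dictionary value.

```python
# hypothetical sketch: names and counts are illustrative assumptions,
# not the real Libre-SOC configuration
fu_counts = {"alu": 2, "mul": 1, "div": 1, "ldst": 1}


def allocate_fus(counts):
    """expand a {name: quantity} dict into a flat list of FU instance names."""
    return [f"{name}{i}" for name, qty in counts.items() for i in range(qty)]


fus = allocate_fus(fu_counts)
```

changing "alu": 2 to "alu": 4 is the whole experiment as far as allocation goes. as said above, that alters how many units exist, not how well a general purpose unit compares to a specialist one.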

specialised units are always more power efficient.  they are also...
specialised, i.e. useless for general purpose.

you make a hardware triangle fill unit like Jeff did, it's not going
to do multiplication, is it?

so the RISC microcoding approach tries to get halfway. VSX has
Rijndael 128-bit sub-functions which, unfortunately, are great at
128-bit AES and useless at 256-bit AES.

today Lauri suggested adding cumulative vector add and multiply.  this
would be better done using SV as it stands, performing a vectorised
mapreduce with an O(N log N) completion time.
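
a rough sketch of one way to read that, assuming "cumulative add" means an inclusive prefix sum. it is plain python standing in for vector operations, not SV assembler, and the function name cumulative_add is hypothetical. each pass is a single elementwise add of the vector with a shifted copy of itself, so there are about log2(N) passes and roughly N log N scalar adds in total.

```python
def cumulative_add(vec):
    """inclusive prefix sum built from elementwise vector adds.

    returns (result, passes, adds) so the cost can be inspected.
    """
    out = list(vec)
    n = len(out)
    passes = 0
    adds = 0
    step = 1
    while step < n:
        # one pass: add the vector to a copy of itself shifted by `step`
        prev = list(out)
        for i in range(step, n):
            out[i] = prev[i] + prev[i - step]
            adds += 1
        step *= 2
        passes += 1
    return out, passes, adds


result, passes, adds = cumulative_add([1, 2, 3, 4, 5, 6, 7, 8])
```

cumulative multiply would be the same loop with * in place of +. no new instruction is needed, only the existing elementwise operation applied repeatedly.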

it is all a compromise and we have no idea where the line will be
drawn: merely that there exists a process, outlined by Jeff, by which
we will be able *to* decide where that line is.

l.