[Libre-soc-dev] Hardware-accelerated specialized instructions

Fri Dec 11 00:27:48 GMT 2020

On Thu, Dec 10, 2020 at 4:14 PM Luke Kenneth Casson Leighton
<lkcl at lkcl.net> wrote:
>
> that simply increases or decreases the amount of FUs.  it does nothing
> to change the effectiveness of specialist hardware units compared to
> general purpose CPU units.
>
> specialised units are always more power efficient.  they are also...
> specialised i.e. useless for general purpose.

Right. It helps determine how many specialized units are optimal as
well, because once they are designed it is easy to vary their quantity
and observe the effect.

> you make a hardware triangle fill unit like Jeff did, it's not going
> to do multiplication, is it?
>
> so the RISC microcoding approach tries to get half way. VSX has
> Rijndael 128 bit sub functions which, unfortunately, are great at 128
> bit AES and useless at 256 bit AES.
>
> today Lauri suggested adding cumulative vector add and multiply.  this
> would be better done using SV as it stands and perform vectorised
> mapreduce with an O(N log N) completion time.
>
> it is all a compromise and we have no idea where the line will be
> drawn: merely that there exists a process, outlined by Jeff, by which
> we will be able *to* decide where that line is.
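
For reference, the vectorised mapreduce idea quoted above can be
pictured as a pairwise tree reduction. The sketch below is plain Python,
not SV or Power ISA code; tree_reduce and its pass counter are
hypothetical names used only for illustration, and reading "cumulative"
add/multiply as a reduction to a single value is an assumption.

```python
from operator import add, mul

def tree_reduce(values, op):
    """Pairwise reduction; returns (result, number_of_passes)."""
    vec = list(values)
    if not vec:
        raise ValueError("need at least one element")
    passes = 0
    while len(vec) > 1:
        # Combine neighbouring pairs. The pair operations within one pass
        # do not depend on each other, so a vector unit could in principle
        # issue them together (an assumption, not a statement about SV).
        paired = [op(vec[i], vec[i + 1]) for i in range(0, len(vec) - 1, 2)]
        if len(vec) % 2:
            paired.append(vec[-1])  # odd element is carried to the next pass
        vec = paired
        passes += 1
    return vec[0], passes

total, add_passes = tree_reduce(range(1, 9), add)        # 8 elements
product, mul_passes = tree_reduce([1, 2, 3, 4, 5], mul)  # 5 elements
```

The same loop serves both add and multiply, which is the attraction of
building this from existing vector operations rather than adding one
dedicated instruction per operator.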

Right. I understand the process. I'm trying to understand which
variables can be adjusted within it. (I call the process experimental
in nature, not untested: Jeff has proven it works.) So far it seems
that the design of the specialized units is one variable, the quantity
of specialized units is another, and the quantity of non-specialized
units is a third.
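
To make those variables concrete, here is a minimal sketch of a sweep
over them. It is not Jeff's process or any Libre-SOC tool: toy_cycles is
a made-up cost model, and every name and number in it is hypothetical.
It assumes a general-purpose unit retires one operation per cycle, a
specialized unit retires "speedup" operations per cycle but only for the
work it was designed for, and both kinds run concurrently. The speedup
parameter stands in for "variation in the design" of the specialized
unit.

```python
from itertools import product

def toy_cycles(special_ops, general_ops, n_special, n_general, speedup):
    """Hypothetical cycle count for a workload under a toy cost model."""
    if n_special == 0:
        # No specialized units: all work falls back to general units.
        return (special_ops + general_ops) / n_general
    # Both pools run concurrently; the slower pool sets the finish time.
    return max(special_ops / (n_special * speedup),
               general_ops / n_general)

def sweep(special_ops, general_ops, special_counts, general_counts, speedups):
    """Evaluate every combination and return results sorted fastest first."""
    results = []
    for n_s, n_g, s in product(special_counts, general_counts, speedups):
        cycles = toy_cycles(special_ops, general_ops, n_s, n_g, s)
        results.append(((n_s, n_g, s), cycles))
    return sorted(results, key=lambda r: r[1])

# Made-up workload: 800 specializable ops, 200 general ops.
ranking = sweep(800, 200, special_counts=[0, 1, 2],
                general_counts=[2, 4], speedups=[8])
best_config, best_cycles = ranking[0]
```

The model ignores area and power, so it always favours more units; a
real comparison would need those costs added before the ranking means
anything.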

Cole