[Libre-soc-isa] [Bug 1056] questions and feedback (v2) on OPF RFC ls010

Wed May 31 10:48:04 BST 2023

https://bugs.libre-soc.org/show_bug.cgi?id=1056

--- Comment #34 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Paul Mackerras from comment #33)
> Interesting example. I'll have to think about how I would implement that.
> 
> Ignoring BE for the moment, what kind of structure do you have in your
> design for handling this kind of source/destination width mismatch? Is it
> something like a bunch of multiplexers ahead of the ALU, or is there a more
> clever way to do it?

i'll let luke answer that, but basically we can always fall back to
element-at-a-time operation if the operation pattern is too complex to justify
a dedicated fast-path (this one is probably worth a fast-path). i'm
not sure this specific part is fully designed in hardware yet.

> 
> > hardware would be something like (assuming registers and ALUs are in LE):
> 
> (Actually, registers and ALUs don't have endianness.)

they do here, because you can access them with different element sizes, which
give different results depending on how element indexes are translated to parts
of a 64-bit word. this is exactly how memory also gains program-visible
endianness (ignoring how it looks from external hardware), since you can just
read the first byte of a 64-bit word and see whether it is the most-significant
or the least-significant byte.
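
a minimal python sketch of that point, assuming a 64-bit register modelled as a
plain integer. the helper name get_element and its big_endian flag are
hypothetical, made up for illustration only; this is not libre-soc hardware or
simulator code:

```python
MASK64 = (1 << 64) - 1

def get_element(reg, index, elwidth, big_endian=False):
    """Return element `index` (elwidth bits wide) of a 64-bit register value.

    Hypothetical model: with LE numbering element 0 is the least-significant
    part of the word, with BE numbering it is the most-significant part.
    """
    assert 64 % elwidth == 0 and 0 <= index < 64 // elwidth
    if big_endian:
        shift = 64 - (index + 1) * elwidth
    else:
        shift = index * elwidth
    return ((reg & MASK64) >> shift) & ((1 << elwidth) - 1)

reg = 0x0102030405060708
# same register contents, same element index, different answers:
print(hex(get_element(reg, 0, 8)))                   # 0x8
print(hex(get_element(reg, 0, 8, big_endian=True)))  # 0x1
# at the full 64-bit width the numbering choice is invisible:
print(get_element(reg, 0, 64) == get_element(reg, 0, 64, big_endian=True))  # True
```

the difference only shows up once elements narrower than the register are
addressed, which is the same way a byte read exposes memory endianness.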

> In any case, I accept that Simple-V is already so incredibly complex that
> you can't really afford the extra complexity of my idea.

yes, we already spent a lot of time going over basically the same idea before
and de-facto decided the extra hardware complexity was not worth it both
because of the extra cost and because luke couldn't hold it all in his head:
https://bugs.libre-soc.org/show_bug.cgi?id=560

> And if something
> like sv.lhz/elwidth=16 gets things into registers in the right order in BE
> mode (i.e. element-swapped compared with what a plain ld would do) then that
> answers at least one of my main problems with saying the register file is
> always LE.

yes, that's what happens afaict.
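
a rough python model of the difference being described, under the assumptions
that memory is BE, that element i of the destination sits at bits 16*i and up
of the register (LE element numbering), and that the load proceeds one halfword
at a time. the function names are hypothetical and this is only a sketch of the
behaviour discussed, not the spec's pseudocode:

```python
def plain_ld_be(mem):
    """Model of a single 64-bit load from big-endian memory."""
    return int.from_bytes(mem[:8], "big")

def elementwise_lhz_be(mem, count=4):
    """Model of loading `count` big-endian halfwords one element at a time,
    placing element i at bits [16*i, 16*i + 16) of the register."""
    reg = 0
    for i in range(count):
        half = int.from_bytes(mem[2 * i:2 * i + 2], "big")
        reg |= half << (16 * i)
    return reg

# four big-endian halfwords 1, 2, 3, 4 at increasing addresses
mem = bytes([0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04])

print(f"{plain_ld_be(mem):#018x}")         # 0x0001000200030004
print(f"{elementwise_lhz_be(mem):#018x}")  # 0x0004000300020001
```

in this model the element-at-a-time load leaves the halfword from the lowest
address in element 0, whereas the plain 64-bit load leaves it in the top 16
bits, i.e. the two results are element-swapped relative to each other.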

also, grev (a subset of what grevlut can do) can do any combination of any
number of bit/byte/element swaps in one instruction, so we don't need to go
through memory every time.
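
for reference, a python sketch of a 64-bit generalized reverse, following the
usual butterfly formulation (as in the RISC-V bitmanip drafts). the name grev64
and the exact operand encoding here are assumptions for illustration, not the
libre-soc instruction definition:

```python
MASK64 = (1 << 64) - 1

# one mask per stage: stage i swaps adjacent blocks of 2**i bits
STAGE_MASKS = [
    0x5555555555555555,  # swap adjacent bits
    0x3333333333333333,  # swap adjacent bit-pairs
    0x0F0F0F0F0F0F0F0F,  # swap adjacent nibbles
    0x00FF00FF00FF00FF,  # swap adjacent bytes
    0x0000FFFF0000FFFF,  # swap adjacent halfwords
    0x00000000FFFFFFFF,  # swap the two words
]

def grev64(x, k):
    """Generalized reverse: bit i of k enables the swap of adjacent
    2**i-bit blocks across the whole 64-bit value."""
    x &= MASK64
    for i, mask in enumerate(STAGE_MASKS):
        if k & (1 << i):
            shift = 1 << i
            x = ((x & mask) << shift) | ((x >> shift) & mask)
    return x & MASK64

print(f"{grev64(0x0102030405060708, 0b111000):#018x}")  # byte swap: 0x0807060504030201
print(f"{grev64(0x0001000200030004, 0b110000):#018x}")  # reverse 16-bit elements: 0x0004000300020001
print(f"{grev64(1, 0b111111):#018x}")                   # full bit reverse: 0x8000000000000000
```

each bit of k independently selects one swap granularity, so a single value of
k can combine, say, a byte swap within elements with a reversal of the elements
themselves, all on a value that is already in a register.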
