[Libre-soc-isa] [Bug 560] big-endian little-endian SV regfile layout idea

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Tue Jan 5 03:49:09 GMT 2021


https://bugs.libre-soc.org/show_bug.cgi?id=560

--- Comment #63 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Jacob Lifshay from comment #62)
> (In reply to Luke Kenneth Casson Leighton from comment #56)
> > (In reply to Jacob Lifshay from comment #55)
> > > (In reply to Luke Kenneth Casson Leighton from comment #53)
> > > > (In reply to Alexandre Oliva from comment #50)
> > > > > do you see now why it doesn't make sense that the conversion from BE to LE
> > > > > (or vice versa) places the MSByte in the LSByte?
> > > > 
> > > > deep breath: it doesn't matter if it "makes sense", it's what the actual
> > > > code - the simulator, the HDL of microwatt and the HDL of Libre-SOC -
> > > > actually do.
> > > 
> > > Yes, and Alexandre and I are saying that the CPU should be changed for the
> > > software reasons explained previously:
> > 
> > the hardware is already so complex and this introduces another dimension of
> > complexity that it is going to be one of those things that is simply not a
> > good idea to continue discussing for implementation at this time.
> > 
> > changes involving the register files are, as you have noted many times
> > already, where i put my foot down and say "no" due to the inherent
> > complexity involved in even beginning to assess, starting from the
> > discussion and escalating from there.
> 
> Note that the HW implementation I proposed in comment #55 would require a
> 5-input mux on ALU pipelines inputs/outputs and a 2-input mux on register
> R/W ports.

I think the 5-input mux is a far cry from the "full 8-in 8-out crossbar" that
you were afraid of needing. It would be 6*64 gates (5 2-in NAND and 1 5-in
NAND) per 64-bit input/output with a 2 gate delay. I'd expect that to be small
enough to be doable.
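
As a rough check of that estimate, here is a minimal Python sketch of one way
to build such a mux: a one-hot-select, two-level NAND-NAND structure. The
one-hot select encoding and the names nand, onehot_mux_bit, gates_per_bit and
mux64 are assumptions made for illustration; none of them come from the
simulator or the HDL. Under that assumption each output bit costs one 2-input
NAND per mux input plus one wide NAND, which gives 6 gates per bit for 5
inputs and a 2 gate delay.

```python
def nand(*bits):
    """N-input NAND on 0/1 values."""
    return 0 if all(bits) else 1


def onehot_mux_bit(data, sel):
    """One output bit of an N-input mux in NAND-NAND form.

    data and sel are equal-length sequences of 0/1 values.
    sel is assumed to be one-hot (exactly one entry is 1).
    """
    # First level: one 2-input NAND per mux input (N gates).
    first_level = [nand(d, s) for d, s in zip(data, sel)]
    # Second level: a single N-input NAND (1 gate).
    return nand(*first_level)


def gates_per_bit(n_inputs):
    """Gate count per output bit for the structure above."""
    return n_inputs + 1


def mux64(words, sel):
    """Apply the per-bit mux across a 64-bit datapath."""
    out = 0
    for bit in range(64):
        data = [(w >> bit) & 1 for w in words]
        out |= onehot_mux_bit(data, sel) << bit
    return out
```

With this structure, gates_per_bit(5) * 64 is 384 gates per 64-bit
input/output, and the same function applies to narrower muxes. Generating the
one-hot select signals is not counted here.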

> Using a slight variation (busses and result latches always in LE, ALUs
> byteswap differently to compensate) of that design that I haven't completely
> thought through, it can be reduced to a 4-input mux on ALU pipelines
> inputs/outputs and no mux on the registers.

Note that, in the slight variation mentioned above, anything that operates at
an element size other than 64-bit (mostly just partitionable ALUs) would need
the ALU byteswapping; pipelines that only operate at 64-bit element size are
like the register ports and wouldn't require any byteswapping.
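
To make "byteswapping at an element size" concrete, here is a small Python
sketch that reverses the bytes within each element of a 64-bit value. It is
only an illustration: the function name byteswap_elements and the choice of
1, 2, 4 and 8 byte element sizes are assumptions, and the exact set of swaps
selected by the mux in comment #55 is not restated here.

```python
def byteswap_elements(value, elwidth_bytes):
    """Reverse the byte order inside each element of a 64-bit value.

    value is treated as an unsigned 64-bit integer and elwidth_bytes is
    the element size in bytes (assumed to be 1, 2, 4 or 8).
    """
    assert elwidth_bytes in (1, 2, 4, 8)
    raw = value.to_bytes(8, "little")
    out = bytearray()
    # Walk the register one element at a time, reversing each element.
    for start in range(0, 8, elwidth_bytes):
        out += raw[start:start + elwidth_bytes][::-1]
    return int.from_bytes(out, "little")
```

For example, byteswap_elements(0x0102030405060708, 2) swaps the two bytes of
each 16-bit element, and a 1-byte element size leaves the value unchanged.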

The 4-input mux would be 5*64 gates (4 2-in NAND and 1 4-in NAND) per 64-bit
input/output with a 2 gate delay. I'd expect that to be small enough to be
easily doable.
