[Libre-soc-dev] fantastically-weird regfile
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Dec 14 01:07:29 GMT 2020
On 12/14/20, Jacob Lifshay <programmerjake at gmail.com> wrote:
> That's very similar to what I already did for int and fp registers, see:
>
> https://libre-soc.org/openpower/sv/svp_rewrite/svp64/
ah i completely missed it because it's not spelled out in pseudocode
or compacted as words.
can you write it up so it can be understood and evaluated?
the reason i say that is because it has to be thought through what the
implications for the routing are, and how elwidths would work.
the CRs are only 8x32 bit and consequently the entire regfile is small
enough to get away with weirdness, where SRAMs will be flat-out not an
option. consequently at the hardware level using costly DFFs is not
such a high gate penalty.
if the same trick is tried on 64 bit registers where we have 128 of
them, not being able to use SRAMs is a massive penalty that has to be
respected and set as a limit on what can and cannot be done.
therefore if the indexing of INT and FP is not straightforward
(requires large crossbars to turn the data round *into*
straightforward linear SRAM friendly access) there needs to be a damn
good reason.
the reason for recommending doing CRs this way is because each batch
of 8 is 32 bit and that's an easy quantity to handle...
... as long as it's aligned.
the moment a CR vector read/write is nonaligned it results in a huge
explosion of Dependency Matrix entries plus a massive routing and
wiring proliferation.
32 bit reads and writes are dead simple to assign to e.g. an aligned
8x8 SIMD operation. that's one CR per byte into the predicate; it's
also when Rc=1 a dead straight 32 bit CR write.
if on the other hand the 3 bit CR reads are misaligned (starting at
CR6 for example) that involves 2x CR 32 bit reads, shift-and-mask
operations just like in nonaligned LD/ST and i do not think it wise
to start down that route.
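
to illustrate the aligned-versus-misaligned cost, here is a rough python
sketch, not the actual regfile design. assumptions (not from the spec):
the CR regfile is modelled as a list of 32-bit rows, each row holds 8 CR
fields of 4 bits, and field i of a row sits at bits 4*i..4*i+3 (LSB0
numbering for simplicity; Power ISA itself numbers MSB0). the name
read_cr_batch is hypothetical.

```python
CRS_PER_ROW = 8          # assumption: 8 CR fields packed per 32-bit row
CR_BITS = 4              # each CR field is 4 bits wide
ROW_MASK = 0xFFFF_FFFF   # keep results to 32 bits


def read_cr_batch(regfile, first_cr):
    """Fetch 8 consecutive CR fields starting at first_cr.

    Returns (packed 32-bit value, number of 32-bit row reads needed).
    The sketch does not handle a misaligned read from the last row.
    """
    row, offset = divmod(first_cr, CRS_PER_ROW)
    if offset == 0:
        # aligned: one straight 32-bit read, no shifting at all
        return regfile[row], 1
    # misaligned: two row reads, then shift-and-mask to stitch them,
    # the same pattern as a nonaligned LD/ST
    shift = offset * CR_BITS
    lo = regfile[row] >> shift
    hi = (regfile[row + 1] << (32 - shift)) & ROW_MASK
    return lo | hi, 2


# each nibble holds its own CR number so the result is easy to eyeball
regfile = [0x76543210, 0xFEDCBA98]
aligned = read_cr_batch(regfile, 0)
misaligned = read_cr_batch(regfile, 6)
```

the aligned case touches one row; starting at CR6 touches two rows and
needs the shift/mask stage, which is the extra routing being warned
about above.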
l.