[Libre-soc-dev] load/store quad and svp64

Luke Kenneth Casson Leighton lkcl at lkcl.net
Tue Apr 12 16:51:44 BST 2022

crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Tue, Apr 12, 2022 at 4:36 PM Jacob Lifshay <programmerjake at gmail.com> wrote:
> On Tue, Apr 12, 2022, 03:07 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> wrote:
> > additionally, wishbone is simply not capable of handling greater than
> > 64-bit data buses, so we would be forced to implement WB burst-mode right
> > the way through the entire codebase down to the DRAM.
> >
> no, you aren't...all you need for 128-bit atomicity is to have cpus
> coordinate such that one 128-bit atomic is atomic *only relative to other
> cpus' 128-bit (and shorter) operations on the same memory block*,

that means cache coherency, which is a pig on its own.  ariane
actually implemented the atomic operations (QTY 1) in the L2 Cache (!)
and set up a special

> > saying "just" implement lq etc is basically about FIVE months of work.
> >
> i think that may be overestimating quite a bit...it should be much easier
> once we have a working cache-coherency protocol -- which we need anyway for
> multi-core.

2-core SMP is almost done in microwatt due to the addition of
cache "snoop" capability (external cache-line invalidation).
problem is, it's single-cycle, hence the need for stalling
(global hardware spin-lock) to prevent one CPU writing
QTY 2 of 64-bit writes to its cache line(s) in 2+ cycles
whilst another CPU writes to *its* same cache line for
one of the same 64-bit words.
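
to illustrate the hazard, here is a toy cycle-by-cycle model in python.
it is only a sketch and is NOT microwatt code: the two "CPUs", the
schedule string, the data patterns and the run() / torn() helpers are
all made up for this example. each CPU performs one 128-bit store as
two 64-bit writes, and the optional global lock stands in for the stall
described above.

```python
# Toy cycle-by-cycle model, NOT microwatt code. All names are hypothetical.
A = 0xAAAAAAAAAAAAAAAA   # 64-bit pattern written by made-up CPU "A"
B = 0xBBBBBBBBBBBBBBBB   # 64-bit pattern written by made-up CPU "B"
VALUE = {"A": A, "B": B}


def run(schedule, use_lock):
    """Replay `schedule`, a string such as "ABBA" naming which CPU gets each
    cycle (repeated as needed), until both CPUs have stored both halves.
    The schedule must mention both "A" and "B".
    Returns the final (word0, word1) of the shared 128-bit location."""
    mem = [0, 0]                  # the two 64-bit words of the location
    done = {"A": 0, "B": 0}       # halves each CPU has written so far
    owner = None                  # CPU currently holding the global lock
    cycle = 0
    while min(done.values()) < 2:
        cpu = schedule[cycle % len(schedule)]
        cycle += 1
        if done[cpu] == 2:
            continue              # this CPU has already finished its store
        if use_lock:
            if owner is None:
                owner = cpu       # take the lock for the whole 128-bit store
            elif owner != cpu:
                continue          # stalled: the other CPU is mid-store
        mem[done[cpu]] = VALUE[cpu]
        done[cpu] += 1
        if done[cpu] == 2 and owner == cpu:
            owner = None          # both halves written, release the lock
    return tuple(mem)


def torn(result):
    """True if the two halves come from different CPUs."""
    return result[0] != result[1]
```

with the interleaving "ABBA" and no lock, word 0 ends up holding B's
pattern and word 1 holding A's, i.e. a torn 128-bit value. with
use_lock=True the same schedule stalls B until A has written both
halves, and the final value is entirely B's.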

microwatt is write-thru cache hence why i said about needing
to do the 64-bit writes down through the Wishbone Bus.

