[Libre-soc-dev] [OpenPOWER-HDL-Cores] microwatt dcache potential bug (overlap r0 and r1)

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sat Jan 15 11:32:58 GMT 2022

crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Sat, Jan 15, 2022 at 7:25 AM Paul Mackerras <paulus at ozlabs.org> wrote:

> On Fri, Jan 14, 2022 at 11:25:02PM +0000, Luke Kenneth Casson Leighton wrote:
> >                         if req.valid = '1' and req.same_tag = '1' and
> >                             ((r1.dcbz = '1' and req.dcbz = '1') or
> >                              (r1.dcbz = '0' and req.op = OP_LOAD_MISS)) and
> >                             r1.store_row = get_row(req.real_addr) then
> >                             r1.full <= '0';
> >
> > and it *overwrites* r1.full back to zero.
> Ummm, it shouldn't,

yeah, it's odd :)

oh hang on: in the latest dcache.vhdl there is this:

                        if r1.full = '1' and r1.req.same_tag = '1' and
                            ((r1.dcbz = '1' and req.dcbz = '1') or
                             r1.req.op = OP_LOAD_MISS) and
                            r1.store_row = get_row(r1.req.real_addr) then

notice how that's testing "if r1.full=1" not "if req.valid"?
that miiight actually achieve the same effect, but i need
to wake up with some coffee first to assess it.
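
to make the comparison concrete, here is a minimal plain-python sketch of the two conditions as predicates. it is not the nmigen code from dcache.py and it does not model the clocked behaviour of r1. the Req/R1 classes, OP_OTHER, and the bit widths inside get_row() are assumptions made up for illustration; only the boolean structure is copied from the two VHDL fragments quoted above.

```python
from dataclasses import dataclass

OP_LOAD_MISS = "OP_LOAD_MISS"
OP_OTHER = "OP_OTHER"        # hypothetical stand-in for any other op

ROW_OFF_BITS = 3             # assumption: illustrative widths only,
ROW_BITS = 3                 # not taken from dcache.vhdl


def get_row(addr):
    """Hypothetical stand-in for the dcache get_row() helper."""
    return (addr >> ROW_OFF_BITS) & ((1 << ROW_BITS) - 1)


@dataclass
class Req:
    valid: bool
    same_tag: bool
    dcbz: bool
    op: str
    real_addr: int


@dataclass
class R1:
    full: bool
    dcbz: bool
    store_row: int
    req: Req


def quoted_cond(r1, req):
    """Condition from the older quoted VHDL: tests req.valid and req.*"""
    return bool(req.valid and req.same_tag and
                ((r1.dcbz and req.dcbz) or
                 (not r1.dcbz and req.op == OP_LOAD_MISS)) and
                r1.store_row == get_row(req.real_addr))


def latest_cond(r1, req):
    """Condition from the latest dcache.vhdl: tests r1.full and r1.req.*"""
    return bool(r1.full and r1.req.same_tag and
                ((r1.dcbz and req.dcbz) or
                 r1.req.op == OP_LOAD_MISS) and
                r1.store_row == get_row(r1.req.real_addr))


# one input where the two disagree: a valid incoming request, r1 not full
req = Req(valid=True, same_tag=True, dcbz=False,
          op=OP_LOAD_MISS, real_addr=0x40)
r1 = R1(full=False, dcbz=False, store_row=get_row(0x40), req=req)

print(quoted_cond(r1, req))  # True
print(latest_cond(r1, req))  # False
```

this only shows that the two expressions are not identical as pure boolean functions. whether they have the same effect in the real pipeline depends on when r1.full and r1.req are updated relative to req, which this sketch deliberately leaves out.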

> I see this at line 1561 of dcache.py:
>                 (~r1.dcbz & (r1.req.op == Op.OP_LOAD_MISS))) &

easy to recognise as the original, isn't it? :)

oh hang on the latest dcache.vhdl is quite different.
            ((r1.dcbz = '1' and req.dcbz = '1') or
              r1.req.op = OP_LOAD_MISS) and

> Notice you have r1.req.op there whereas the VHDL has req.op.  I think
> that's your bug.  (Similarly line 1560 has r1.req.dcbz not req.dcbz,
> and line 1559 has r1.req.same_tag not req.same_tag.)

yes, i tried that a couple of days ago and it resulted in data corruption
much earlier.  after a frustrating day trying to find out why, i gave up on
it.  i'll come back to it later because i recognise that it's part of reducing

results of the two workarounds: the two verilator sims are currently
running at a mind-numbingly-fast 1,000 instructions per second, and they
are both up to here:

[    0.000000] Kernel command line:
[    0.000000] Dentry cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.000000] Inode-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off

if that continues successfully i'll leave it alone unless i have a good
reason not to [such as latency / gate timing].

i estimate they will get to the boot prompt some time in the next 2
days (!) if all goes well. although i may just check that DEC is properly

also it's probably time to start checking into running on FPGAs because
this is just so ridiculously slow that champions of international paint-drying
watching competitions would be screaming and running away.

a proper DMI-dump-and-restore system i think is becoming an
increasingly high priority: the round-trip on debugging, here, is
on an O(N^2) curve if restarting every time from cold boot.
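
a rough sketch of the arithmetic behind that, under loudly stated assumptions: the failure points below are made-up numbers, a snapshot restore is assumed to cost nothing, and a fix is assumed not to invalidate earlier snapshots. only the 1,000 instructions per second figure comes from the sims mentioned above.

```python
SIM_RATE = 1_000  # instructions per second, as quoted for the verilator sims


def cold_boot_cost(failure_points):
    """Instructions simulated when every debug run restarts from zero."""
    return sum(failure_points)


def checkpoint_cost(failure_points):
    """Instructions simulated when each run resumes from a snapshot
    taken at the previous failure point (idealised, zero restore cost)."""
    total = 0
    previous = 0
    for point in failure_points:
        total += point - previous
        previous = point
    return total


# hypothetical: ten successive bugs, one every 10 million instructions
points = [n * 10_000_000 for n in range(1, 11)]

cold = cold_boot_cost(points)
snap = checkpoint_cost(points)

print(cold, cold / SIM_RATE / 86400)  # total instructions, days of sim time
print(snap, snap / SIM_RATE / 86400)
```

with evenly spaced failure points the cold-boot total grows with the square of the number of bugs found, while the snapshot total grows only with the distance to the last one.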

