[Libre-soc-dev] microwatt dcache potential bug (overlap r0 and r1)
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Thu Jan 13 14:17:30 GMT 2022
hi folks, got a potential issue with microwatt dcache.vhdl on the r0_full
and r1.full signals. i outlined on the mattermost chat that for the libresoc
core we're feeding dcache from wishbone with fully-pipelined LDSTs,
whereas microwatt mostly operates from a wishbone-classic
source that has an alternating ack-stall pattern. thus this bug, which is
quite obscure, has not been encountered there. it *may* be why microwatt
is occasionally seeing corrupted LDs.
the issue is that r1.full is set to zero whilst there are still ACKs
outstanding, in both the RELOAD_WAIT_ACK and STORE_WAIT_ACK states.
i suspect that the occasional LD corruption is down to r1.full
being set to zero whilst a batch of ACKs is still expected: a *new*
r0 LD operation then comes in and gets transferred to r1 *whilst
there are still ACKs pending from the previous LD*.
in other words, dcache should be indicating that it is busy, but
is failing to do so correctly, and the only reason everything [semi-]
works is because LD/STs - including from the MMU - come in
at a sufficiently slow rate as not to trigger this bug [all the time].
the small differences between libresoc and microwatt however
mean that it *does* get triggered much more regularly.
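
to make the suspected hazard concrete, here is a toy python model. it is
only a sketch under stated assumptions: ToyDCache, accept(), ack() and the
early_clear flag are hypothetical names invented for illustration, and the
real FSM in dcache.vhdl / dcache.py is considerably more involved. "full"
stands in for r1.full, and the model does not attempt to reproduce the
real acks_pending bookkeeping.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDCache:
    """Toy model only: NOT the real dcache.vhdl / dcache.py FSM."""
    early_clear: bool              # True: drop `full` before ACKs have drained
    full: bool = False             # stands in for r1.full
    state: str = "IDLE"
    acks_pending: int = 0
    overlaps: list = field(default_factory=list)

    def accept(self, tag, n_acks):
        """Try to move a request from r0 into r1; return True if accepted."""
        if self.full:
            return False           # cache correctly reports busy
        if self.acks_pending > 0:
            # a new request got in while the previous one still expects ACKs
            self.overlaps.append(tag)
        self.full = True
        self.state = "RELOAD_WAIT_ACK"
        self.acks_pending += n_acks
        if self.early_clear:
            self.full = False      # assumed buggy behaviour: cleared too soon
        return True

    def ack(self):
        """Consume one ACK; return to IDLE once none are outstanding."""
        if self.acks_pending == 0:
            return
        self.acks_pending -= 1
        if self.acks_pending == 0:
            self.state = "IDLE"
            self.full = False      # late clear: only once back at IDLE


def run(early_clear):
    dc = ToyDCache(early_clear=early_clear)
    dc.accept("LD-A", 4)                 # first LD expects 4 ACKs
    dc.ack()                             # only 1 of the 4 ACKs has arrived
    second = dc.accept("LD-B", 4)        # back-to-back (pipelined) second LD
    return second, dc.overlaps
```

with early_clear=True the second LD is accepted whilst three ACKs from the
first are still outstanding; with early_clear=False it is refused until the
model is back at IDLE. a slow, alternating request source would rarely
present the second LD early enough to hit the first case.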
[bear in mind i am primarily focussed on fixing Libre-SOC, the fact that
we copied dcache.vhdl near-verbatim to create dcache.py makes this
a common issue.]
i at first assumed that it would be perfectly reasonable to move the
clearing of r1.full to the point where the state is set to IDLE
for *both* RELOAD_WAIT_ACK *and* STORE_WAIT_ACK, but this
is not the case: doing that in dcache.py results in an instance where
acks_pending misses one ack, and it all goes to hell. i don't know
the dcache FSM well enough, despite looking at it for months,
to sort that myself.
i am currently doing two things:
1) running linux under verilator with the attached patch to *microwatt*.
this should take about 15 mins to get to the same failure point
(seems ok so far), but bear in mind i cannot do significant
stress-testing (it's just too slow under verilator).
2) running linux under verilator with a similar patch to *libresoc*
(dcache.py) i.e. *only* delaying clearing r1.full on RELOAD_WAIT_ACK
until the IDLE point.
i hope to have a better handle on this within a couple of days,
and more information.