[Libre-soc-dev] microwatt-libre-soc interoperable verilator snapshots / debugging

Sun Jan 9 18:49:51 GMT 2022

On January 9, 2022 5:54:02 PM UTC, Lauri Kasanen <cand at gmx.com> wrote:
>On Sun, 9 Jan 2022 12:08:40 +0000
>Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:
>
>> * assume that up to a certain instruction, microwatt and libre-soc
>>   have performed identically
>> * run microwatt
>> * in verilator c++ do a **FULL** system-wide state dump.  all
>>   registers, all TLB entries, everything.
>> * terminate the microwatt simulation
>> * run libre-soc
>> * have verilator RELOAD the entire state and continue executing...
>>   ...from where MICROWATT left off.
>
>Why mix microwatt saves at all? It sounds like an unnecessary
>complication, requiring many changes to be able to mix and match saves.

not if they are 100% compatible, because it is the [compatible] ISA-level register/memory state being saved, not the [internal, implementation-specific] micro-architectural state.

also as i said: even making one change to the HDL makes the rebuilt verilator model's state incompatible with previously-saved verilator state.

by saving the register/memory state you can fix an HDL problem then restart from just before the problem and carry on.

>That is: if microwatt and libresoc perform identically until instr X,
>then just save libresoc state at instr X. When you fix that issue and
>hit the next, load that save, and do a new save at the next divergence
>point.

the reason is that at the moment it takes libre-soc 10x longer than microwatt to get to the same point.

it would even be possible to run microwatt in an FPGA, stop, save state, then run verilator libresoc and restore, which is a huge speedup even just for microwatt.

under verilator with microwatt running at around 10,000 instructions per second it took SIX HOURS to get to the boot prompt.

with libresoc using TestIssuer that would take SIXTY hours.

by contrast running microwatt from an FPGA would take under 30 seconds, followed by a memory dump (probably 10 minutes), and running from there would be about 2-3 minutes under verilator to the debug/failure point to be tested.

this is about saving huge amounts of time in debugging where each trip is getting longer and longer.

last week the first failure was so early that it took 10 minutes to run verilator.

a few days ago it was one hour.

now it is 3 hours to get to the 10,000,000 instructions executed point.

tomorrow it will be 6 hours, and we are talking about only the first 20 lines of the linux kernel boot log messages!

l.