[Libre-soc-dev] video assembler
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon May 10 19:59:28 BST 2021
On Monday, May 10, 2021, Jacob Lifshay <programmerjake at gmail.com> wrote:
> > the python one is cycle accurate one-for-one.
> I'll note here that what lkcl probably means by cycle-accurate may be
> different than expected: luke uses the definition where it can count the
> number of executed instructions (or maybe the number of vector elements),
> ascribing 1 cycle to each executed instruction.
the definition i use is the one used by academics to describe spike. it
executes in "program order", and, crucially, *inside spike* a breakpoint
may be placed at the "instruction step" point.
when that breakpoint is single-stepped across the execution of one single
instruction, the source code of the simulator *guarantees* that the state
is absolute 100% "cycle accurate".
thus, it is dead easy to debug new instructions, which is critically
important for research and development, because of this 100% guarantee.
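as a minimal sketch of that guarantee (a toy, not spike, gem5 or the libre-soc python simulator: the class name ToySim, the two opcodes "li" and "add" and the 4-entry regfile are all made up for illustration), each call to step() executes exactly one whole instruction in program order, so a breakpoint on step() always sees complete architectural state:

```python
class ToySim:
    """Hypothetical program-order simulator: one step() == one whole instruction."""

    def __init__(self, program):
        self.program = program
        self.pc = 0              # index of the next instruction, in program order
        self.regs = [0] * 4      # toy 4-entry register file (assumption)
        self.executed = 0        # the "cycle" count in the simulator sense

    def step(self):
        # The "instruction step" point: a breakpoint here is crossed exactly
        # once per instruction, and all effects are visible when it returns.
        op, rd, a, b = self.program[self.pc]
        if op == "li":           # load immediate: rd <- a
            self.regs[rd] = a
        elif op == "add":        # rd <- regs[a] + regs[b]
            self.regs[rd] = self.regs[a] + self.regs[b]
        else:
            raise ValueError("unknown toy opcode: %r" % (op,))
        self.pc += 1
        self.executed += 1


program = [("li", 0, 5, None), ("li", 1, 7, None), ("add", 2, 0, 1)]
sim = ToySim(program)
trace = []
while sim.pc < len(program):
    sim.step()
    # record (instructions executed, full register state) after every step
    trace.append((sim.executed, list(sim.regs)))

for entry in trace:
    print(entry)
```

every entry in trace is a fully consistent snapshot: there is no point at which an instruction is "half done".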
gem5 also follows this design principle and may be called a cycle accurate
simulator.
by contrast, with qemu, if a breakpoint is placed *on qemu itself* (not
"placed on the emulated program" via the gdb remote protocol, but placed on
the internal loop of *qemu*), it is flat-out absolutely impossible to debug
what the hell is going on, because qemu JIT'd the emulated binary.
thus, qemu *IS NOT* a cycle-accurate simulator (and cannot be used without
serious difficulty as the basis for research and development).
this definition does not have anything to do with hardware, for which
"cycle accurate" has a completely different interpretation, conflated as it
is with "pipeline accurate" or "sys_clk tick accurate".
pipeline accurate or sys_clk tick accurate, as i discovered when trying to
debug microwatt, is a bloody nuisance: a single-stepped clock cycle *does not*
mean that a single *instruction* has been executed, nor that its results
are even close to coming down pipelines to hit regfiles. this may take
several more clock cycles.
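to illustrate the difference, here is a rough sketch of a made-up 3-stage pipeline (this is *not* microwatt: the class ToyPipeline, the stage names and the "load immediate only" instruction format are assumptions chosen to keep the example hazard-free). single-stepping tick() shows several clock ticks passing before anything at all reaches the regfile:

```python
class ToyPipeline:
    """Hypothetical 3-stage pipeline: fetch -> execute -> writeback."""

    def __init__(self, program):
        self.program = list(program)   # each entry is (rd, immediate)
        self.next_fetch = 0
        self.stages = [None, None, None]   # [fetch, execute, writeback]
        self.regs = [0] * 4
        self.ticks = 0
        self.retired = 0               # instructions whose result hit the regfile

    def tick(self):
        # The instruction sitting in writeback commits to the regfile on this tick.
        done = self.stages[2]
        if done is not None:
            rd, value = done
            self.regs[rd] = value
            self.retired += 1
        # Everything else moves one stage down the pipe.
        self.stages[2] = self.stages[1]
        self.stages[1] = self.stages[0]
        if self.next_fetch < len(self.program):
            self.stages[0] = self.program[self.next_fetch]
            self.next_fetch += 1
        else:
            self.stages[0] = None
        self.ticks += 1


pipe = ToyPipeline([(0, 5), (1, 7)])
history = []
for _ in range(5):
    pipe.tick()
    # record how many instructions have retired after each clock tick
    history.append(pipe.retired)

print(history)
print(pipe.regs)
```

in this toy the first three ticks retire nothing, so a breakpoint on tick() says very little about which *instruction* has completed.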
in fact, in microwatt, a single tick results in a cascade of cache loads
for instructions or data that have absolutely nothing to do with the
current instruction: this was very annoying to filter out!
basically, "cycle accurate" in the context of simulator discussions
means *program order* instruction accurate. it does *not* mean *hardware
clock cycle tick* accurate.
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68