[Libre-soc-dev] openpower-isa simulator notes

Sun May 30 18:21:16 BST 2021

boris,

some notes here which at some point should be on the wiki, outlining
how openpower-isa works.

first, sections 1.6 and 1.7 were extracted into text form and made
machine-readable.

the tables showing the "Forms" actually line up: the vertical bars
have to match the bitfield header at the top.

then, the alphabetical list AA, LK, RA etc. is also machine-readable.
here, we run into the first discrepancy: L0 L1 L2 had to be invented
(renamed from L) due to ambiguity: FormX.L might be one of *four*
different bitwidths, which is clearly not acceptable for
machine-readable purposes.

a python class pair called SelectableInt and FieldSelectableInt work
together to allow MSB0-ordered *ranges* of bits, from the field
specifications, to be used as if they were contiguous numbers.

examples include SH, which is non-contiguous, but thanks to
FieldSelectableInt standard operators such as + and & just "work".
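
a minimal sketch of the idea, not the real FieldSelectableInt API: the
FieldView class name and the bit positions picked below are
illustrative assumptions. it shows how a list of MSB0 bit numbers
(bit 0 being the most significant) can be gathered into one ordinary
integer so that + and & behave as expected.

```python
class FieldView:
    """hypothetical stand-in: a view onto selected MSB0-numbered bits."""

    def __init__(self, value, width, bits):
        self.value = value   # the full integer, e.g. an instruction word
        self.width = width   # total bitwidth of the full integer
        self.bits = bits     # MSB0 bit numbers, most significant first

    def __int__(self):
        result = 0
        for b in self.bits:
            # MSB0 bit b sits at LSB0 position (width - 1 - b)
            bit = (self.value >> (self.width - 1 - b)) & 1
            result = (result << 1) | bit
        return result

    def __add__(self, other):
        return int(self) + int(other)

    def __and__(self, other):
        return int(self) & int(other)


# non-contiguous selection: MSB0 bits 0, 2, 3 and 7 of an 8-bit value
field = FieldView(0b10110010, 8, [0, 2, 3, 7])
```

here int(field) gathers the bits 1, 1, 1, 0 into 0b1110, and
field + 1 or field & 0b0110 then operate on that gathered number.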

the performance sucks but the resultant code further down is clean and
clear, that being a much higher priority.

the markdown files are, as you discovered, compiled to python.  some
minor changes had to be made: an "unsigned lessthan" operator "<u" and
a few others.
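
as a rough illustration of what "<u" has to mean once compiled to
python (the function name ltu and the default width are assumptions
for this sketch, not the names used by the compiler): both operands
are reduced to their unsigned representation before comparing.

```python
def ltu(a, b, width=64):
    """unsigned less-than: compare a and b as width-bit unsigned values."""
    mask = (1 << width) - 1
    return (a & mask) < (b & mask)


signed_result = -1 < 1        # ordinary (signed) python comparison
unsigned_result = ltu(-1, 1)  # -1 is all-ones when viewed as unsigned
```

the signed comparison is True, whereas the unsigned one is False
because -1 becomes the largest 64-bit value.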

the syntax was so similar to python (whitespace indentation for code
blocks) i used GardenSnake.py from python-ply and it works great. i
had to add an extra lexical pass to insert a newline token; this fixed
a longstanding bug in GardenSnake.py.

the parser.py is a dog's dinner mess, but it is functional.  it makes
some assumptions which have consequences for the usage, but
fascinatingly it turns out that no pseudocode ever seen so far hits
these limitations.

for example you cannot do this:

     RT = RA

because that will assign the *variable* RA to RT (not a copy of the
contents).  instead, this must be done:

     RT[63..0] = RA

or use a temporary.
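
the underlying python behaviour can be sketched like this (Reg is a
hypothetical stand-in class for illustration, not SelectableInt):
plain assignment only rebinds a name to the same object, whereas
writing into the target's contents makes a real copy.

```python
class Reg:
    """hypothetical register-like object with mutable contents."""

    def __init__(self, value):
        self.value = value


# plain assignment: RT becomes another name for the *same* object
RA = Reg(5)
RT = RA
RA.value = 7
aliased = RT.value      # follows RA, because RT *is* RA

# copying the contents: RT2 keeps its own storage
RA2 = Reg(5)
RT2 = Reg(0)
RT2.value = RA2.value
RA2.value = 7
copied = RT2.value      # unaffected by the later change to RA2
```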

a number of small hacks exist in order to create temporary variables
of the right bitwidth: the SelectableInt class *INSISTS* on the
bitwidth of its source operands being the same.

this makes for some interesting issues on 64 bit multiply and divide:
creating a temporary result of the correct size, zero-extending the
source operands etc.  also, we had to alter the pseudocode for
bpermd to specify a temporary index exactly 7 bits wide, then use
the unsigned compare operator.
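
a sketch of why the temporaries are needed, using a hypothetical
FixedWidthInt class (an assumption for illustration, much simpler than
the real SelectableInt): operands of different widths are rejected, so
a full 64x64 multiply has to zero-extend both sources to 128 bits
first.

```python
class FixedWidthInt:
    """hypothetical fixed-width integer that refuses mixed bitwidths."""

    def __init__(self, value, bits):
        self.bits = bits
        self.value = value & ((1 << bits) - 1)  # truncate to the width

    def zero_extend(self, bits):
        # value is already non-negative, so widening adds zero bits
        return FixedWidthInt(self.value, bits)

    def __mul__(self, other):
        if self.bits != other.bits:
            raise ValueError("bitwidth mismatch: %d vs %d"
                             % (self.bits, other.bits))
        return FixedWidthInt(self.value * other.value, self.bits)


a = FixedWidthInt(0xFFFFFFFFFFFFFFFF, 64)
b = FixedWidthInt(2, 64)

truncated = a * b                               # top bit is lost
full = a.zero_extend(128) * b.zero_extend(128)  # full 128-bit product
```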

you may, if you have looked at the resultant python, have noticed that
variables (SPRs, Form-Fields e.g. D in D-Form) are used but are
neither passed in as arguments *nor referenced as self.D*.

this is due to a "trick", a decorator which we called "inject".  i
found a recipe on stackoverflow which when given a dictionary
(self.namespace) will "inject" the contents *into the local function
namespace*.

welcome to one of the beautiful obscure capabilities of python.
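
a minimal sketch of that kind of decorator, assuming the technique is
a temporary update of the function's globals (the Demo class, the
op_addi method and the values are made up for illustration, and the
real inject in the codebase may differ in detail):

```python
from functools import wraps

_MISSING = object()


def inject(func):
    """make the entries of self.namespace visible as bare names in func."""
    @wraps(func)
    def wrapper(self, *args, **kwargs):
        g = func.__globals__
        # remember whatever the injected names previously referred to
        saved = {k: g.get(k, _MISSING) for k in self.namespace}
        g.update(self.namespace)
        try:
            return func(self, *args, **kwargs)
        finally:
            # restore the globals exactly as they were
            for k, v in saved.items():
                if v is _MISSING:
                    g.pop(k, None)
                else:
                    g[k] = v
    return wrapper


class Demo:
    def __init__(self):
        self.namespace = {"RA": 100, "D": 8}

    @inject
    def op_addi(self):
        # RA and D are neither arguments nor attributes of self
        return RA + D


result = Demo().op_addi()
```

note that this touches module-level names for the duration of the
call, so it is not something to use from multiple threads.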

this means, again, that we can keep the parser.py output reasonably
clean and clear and, crucially, NOT have to teach it about all the
Form Fields, every D, K, AA, d0, d1, sh, MB, etc etc etc etc.

caller.py is where everything is pulled together.  a class is
autogenerated (all.py) named ISACaller which inherits from caller.py
and every single autogenerated python markdown class.

this "gives" ISACaller a dictionary of functions listed by their
assembly name, so of course one of the first important tasks of the
PowerDecoder2 is to identify exactly that.  this process uses a list
"_insns" in power_enums.py so that list absolutely has to be kept
up-to-date.
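
the shape of this can be sketched as follows. all class and method
names here are hypothetical, and collecting methods by an "op_" prefix
is an assumption made for the sketch rather than the mechanism the
generated code actually uses:

```python
class BaseCaller:
    """stand-in for the hand-written base: looks up and calls by name."""

    def call(self, asm_name, *operands):
        return self.instrs[asm_name](*operands)


class FixedArith:
    """stand-in for one autogenerated class (one markdown file)."""

    def op_add(self, ra, rb):
        return ra + rb


class FixedLogical:
    """stand-in for another autogenerated class."""

    def op_and(self, ra, rb):
        return ra & rb


class Caller(BaseCaller, FixedArith, FixedLogical):
    """inherits everything, then indexes the functions by assembly name."""

    def __init__(self):
        self.instrs = {name[3:]: getattr(self, name)
                       for name in dir(self) if name.startswith("op_")}


sim = Caller()
total = sim.call("add", 2, 3)
```

once the decoder has identified the assembly name, executing the
instruction is a single dictionary lookup.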

PowerDecoder2 is heavily abstracted: it uses CSV files which
originally came from microwatt decode1.vhdl and it is these which, in
combination with the section 1.6 and 1.7 Fields, allow us to extract
the various bitfields.

all machine-readable, there is not a single instruction in the entire
simulator that is done "by hand".  no exceptions.

in addition to this i have begun extracting the Appendix A.3
pseudocode for INT to FP operations, as "helper" routines.

again: the code exists, so why make mistakes or do extra work?

i have actually made the mistake of converting one of the FP-related
pseudocode functions into python by hand; i will revert that at some
point.

back to caller.py: you can see that there are classes for SPRs, one
for registers, and another for Memory.  SPRs read their names from
*another* CSV file, again so that we do not have hardcoded names in
source code files.

recently we added the option to emulate a RADIX MMU, this uses the Mem
class as its "physical" backing store.

it's a python dictionary.  go figure.
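
to give a feel for how little is needed, here is a sketch of a
dictionary-backed memory. the DictMem class, its ld/st method names
and the 8-byte row size are assumptions for illustration, and only
aligned, whole-row accesses are handled:

```python
class DictMem:
    """hypothetical sparse memory: one dictionary entry per 8-byte row."""

    def __init__(self, row_bytes=8):
        self.row_bytes = row_bytes
        self.mem = {}   # row index -> integer contents

    def ld(self, addr):
        # rows never written to simply read back as zero
        return self.mem.get(addr // self.row_bytes, 0)

    def st(self, addr, value):
        mask = (1 << (8 * self.row_bytes)) - 1
        self.mem[addr // self.row_bytes] = value & mask


mem = DictMem()
before = mem.ld(0x1000)
mem.st(0x1000, 0xDEADBEEF)
after = mem.ld(0x1000)
```

only the rows actually written take up any space, which is what makes
a plain dictionary workable as a "physical" store.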

ironically we may have to end up implementing simulated L1 caches
because Virtual Memory lookups are quite expensive in the simulator
too: they are uncached and hit the Mem backing store multiple times.

all of the pieces of the puzzle hang together in an interdependent
fashion, and there are quite a few.  however, if you are familiar with
OpenPOWER, which you are, there really should not be any puzzlement:
it should all be "expected", if you know what i mean.

the key "why" though is that it aims for clarity and simplicity,
treating speed as not even the least bit important.

supporting C or Sail will however require quite a rabbithole to be
explored, because of the accidental dependence on nmigen.  this is for
another post, another time.

l.