[Libre-soc-bugs] [Bug 236] Atomics Standard writeup needed

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Sat Jul 16 12:32:50 BST 2022


https://bugs.libre-soc.org/show_bug.cgi?id=236

--- Comment #35 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #34)

> added initial atomics assembler, along with script that generates it.

which i've now had to remove and deal with yet another force-master push.

please *do not* break the hard rule against adding auto-generated output to
repositories.

*especially* given that it is a massive batch of identical code.


allow me to be clearer in the instructions.

we need to demonstrate that the POWER9 recommended c++ spinlocks and
atomics are or are not efficient, and to what extent.

the program that needs to be created must therefore:

1) have an option to specify the number of SMT forked processes to run
2) have an option to specify how many lock/unlock pairs shall be performed
   per forked process
3) have an option to specify the range of memory addresses to be locked/unlocked
   ("1" being "all processes lock and unlock the same address")
4) use RECOMMENDED sequences known to be used in c, c++, and the linux
   kernel, such as these (or others already present in the linux kernel
   and other common libraries):
   http://www.rdrop.com/~paulmck/scalability/paper/N2745r.2011.03.04a.html
5) have an option to use the "eh" hints that Paul mentioned, which are
   described in Power ISA 3.1 p1077
6) time the operations ("time ./locktest" would do).

there is no need to add this program in markdown form.

it is purely experimental in nature for the purposes of research.

it is not for the publication of a specification.

it is for the purposes of actually being executed to obtain
information from which a report will have to be (manually) written.

when executed on the TALOS-II workstation with different numbers of
processes and different memory ranges, this will tell us whether
IBM properly designed the ISA or not.  it will not tell us exactly
*how* they actually implemented the atomics but will give at least some
black-box hints.

if the locking remains linear for up to all 72 hyperthreads and it
is of the order of a million locks per second per core regardless
of the number of memory addresses then we can reasonably conclude
that they did a damn good job.

if IBM's atomics do *not* work that well then we are 100% justified in proposing additional
enhancements to the ISA

but *not* until the *actual* statistics have *actually* been measured
and real-world reports obtained.

we do not have access to an IBM POWER10 system so IBM POWER9 will have to do.

bottom line is that if we cannot demonstrate good knowledge of IBM's
atomics then we have absolutely no business whatsoever in proposing
alternatives or enhancements.
