[Libre-soc-bugs] [Bug 236] Atomics Standard writeup needed

Wed Jul 27 23:05:49 BST 2022

https://bugs.libre-soc.org/show_bug.cgi?id=236

Andrey Miroshnikov <andrey at technepisteme.xyz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrey at technepisteme.xyz

--- Comment #47 from Andrey Miroshnikov <andrey at technepisteme.xyz> ---
From the meeting:
Jacob Lifshay says:i've been working on benchmarking atomics on power9 and
comparing to x86 and armv8.1a to come up with another justification for
improving powerisa atomics 
Jacob Lifshay says:
https://git.libre-soc.org/?p=benchmarks.git;a=summary
Jacob Lifshay says:i'm borrowing my proposed implementation from risc-v where
AMOs can be sent to L1/2/3 caches adaptively, wherever the cache is shared
between the different cpus that are accessing that atomic 
Dmitry says:Speaking of C, I've always found a bit of confusion between
barriers and atomicity per se. That's probably because C tends to avoid
architecture-specific stuff and instead talks of an abstract state machine. 
Jacob Lifshay says:you do conflict detection on physical addresses, you can use
conflict detection on effective addresses as a good way to predict physical
address conflicts 
Jacob Lifshay says:all loads/stores in a core...*not* across all cores 
Jacob Lifshay says:the other cores have to rely on cache coherency protocols 
Jacob Lifshay says:yes, but the fetch_add we need are the ones that can be done
in l1/l2/l3/memory, wherever is fastest 
Dmitry says:Basically all but cmpxchg (or ll/sc) are for performance reasons,
aren't they? 
Dmitry says:Because you can implement anything in terms of cmpxchg. 
Jacob Lifshay says:cmpxchg or ll/sc is slow tho.... 
Konstantinos says:might be a stupid question, but would it be possible to have
2 implementations for the same atomics? ie keep the existing L2 ones for
scalable systems and replace them with lighter implementations for CPUs that
target embedded/desktop/workstations/etc, non-multicore systems at any rate 
Jacob Lifshay says:we're keeping the existing ll/sc atomics for back compat and
because they're fully general...fetch_add instructions are useful cuz they can
run faster. 
Dmitry says:This would be strange to have different implementations across the
same arch. 
Konstantinos says:I *did* say it might be a stupid question 😃 
Jacob Lifshay says:you can't just use the existing instructions since you'd
have to combine 5-7 instructions into one fetch_add microop -- terrible 
Dmitry says:ll/sc is notoriously difficult to use, and has no direct
counterpart in the memory model 
Dmitry says:IIRC even ARM got its cmpxchg recently 
Konstantinos says:most of the time it's easier to change the hardware than the
software 
Dmitry says:Yeah I agree. 
Jacob Lifshay says:uuh, ll/sc works just fine -- when you need the full
generality. it's slower otherwise 
Dmitry says:Granted that software is ready for hardware changes 
Dmitry says:😃 
Dmitry says:You can have the generality with cmpxchg. This basically boils down
to a loop. 
Jacob Lifshay says:all of cmpxchg/ll/sc are often slower than a dedicated
fetch_add instruction...no loop needed 
Dmitry says:And you cannot have it with ll/sc, because it doesn't map well to
higher-level languages. 
Dmitry says:Yes that's why I started with "for performance reasons". 
Jacob Lifshay says:high-level languages map to a ll/sc loop currently 
Dmitry says:I guess some cycles can be saved with relaxed semantics 
Dmitry says:I mean C memory model 
Jacob Lifshay says:yes, but you often need
acquire/release/sequential-consistency 
Dmitry says:That is, cmpxchg made its way to higher level 
Dmitry says:Yeah 
Dmitry says:Ok we basically flooded the chat with atomics 😃 
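
To illustrate the fetch_add versus cmpxchg point discussed above, here is a
minimal C11 sketch. It is not taken from the meeting or from the benchmarks
repository; the function names counter_fetch_add and counter_fetch_add_cas are
hypothetical, and the choice of a 64-bit counter with sequentially consistent
ordering is an assumption made only for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Dedicated read-modify-write: no retry loop in the source.
 * Returns the value the counter held before the addition. */
static inline uint64_t counter_fetch_add(_Atomic uint64_t *p, uint64_t v)
{
    return atomic_fetch_add_explicit(p, v, memory_order_seq_cst);
}

/* The same operation expressed through compare-exchange.
 * This is the "boils down to a loop" form: retry until no other
 * thread changed *p between our read and our write. */
static inline uint64_t counter_fetch_add_cas(_Atomic uint64_t *p, uint64_t v)
{
    uint64_t old = atomic_load_explicit(p, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               p, &old, old + v,
               memory_order_seq_cst,    /* ordering on success */
               memory_order_relaxed)) { /* ordering on failure */
        /* On failure, `old` now holds the current value; try again. */
    }
    return old;
}
```

Both functions give the same result. The difference is in what the hardware
has to do: where the ISA has a dedicated instruction (for example x86
`lock xadd` or ARMv8.1-A `LDADD`), compilers typically lower the first form to
that single instruction, whereas the second form always needs a retry loop.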
Jacob Lifshay says:sve2 is only scalable from the hw end, sw sees a fixed impl.
dependent size 
Jacob Lifshay says:suggested atomic implementation: 
https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/atomics/discussion.mdwn;h=050e83458bdb6727d5eb43111de71277ef46a44a;hb=HEAD
Jacob Lifshay says:paul: you wanted a description of a proposed atomics
implementation: 
https://bugs.libre-soc.org/show_bug.cgi?id=236#c46
Jacob Lifshay says:iirc aws graviton3 has sve2 support 
Jacob Lifshay says:amazon, not alibaba 
Jacob Lifshay says:for graviton3
