[Libre-soc-bugs] [Bug 236] Atomics Standard writeup needed
bugzilla-daemon at libre-soc.org
Wed Jul 27 23:05:49 BST 2022
https://bugs.libre-soc.org/show_bug.cgi?id=236
Andrey Miroshnikov <andrey at technepisteme.xyz> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |andrey at technepisteme.xyz
--- Comment #47 from Andrey Miroshnikov <andrey at technepisteme.xyz> ---
From the meeting:
Jacob Lifshay says:i've been working on benchmarking atomics on power9 and
comparing to x86 and armv8.1a to come up with another justification for
improving powerisa atomics
Jacob Lifshay says:
https://git.libre-soc.org/?p=benchmarks.git;a=summary
Jacob Lifshay says:i'm borrowing my proposed implementation from risc-v where
AMOs can be sent to L1/2/3 caches adaptively, wherever the cache is shared
between the different cpus that are accessing that atomic
Dmitry says:Speaking of C, I've always found a bit of confusion between
barriers and atomicity per se. That's probably because C tends to avoid
architecture-specific stuff and instead talks of an abstract state machine.
Jacob Lifshay says:you do conflict detection on physical addresses; you can use
conflict detection on effective addresses as a good way to predict physical
address conflicts
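
One possible way to picture the effective-address predictor mentioned above, as a rough sketch only. It assumes 4 KiB pages, 64-byte cache lines, and accesses that do not straddle a cache line; the names `PAGE_OFFSET_BITS`, `LINE_OFFSET_BITS`, `line_in_page` and `may_conflict` are hypothetical and do not come from the meeting or from any Libre-SOC source. It relies on the fact that the page-offset bits of an address are not changed by translation, so comparing only those bits of two effective addresses can rule a physical conflict out, but never confirm one:

```rust
/// Assumed (hypothetical) parameters: 4 KiB pages, 64-byte cache lines.
const PAGE_OFFSET_BITS: u32 = 12;
const LINE_OFFSET_BITS: u32 = 6;

/// Cache-line index within the page. Address translation leaves these bits
/// untouched, so they are identical in the effective and physical address.
fn line_in_page(ea: u64) -> u64 {
    (ea & ((1u64 << PAGE_OFFSET_BITS) - 1)) >> LINE_OFFSET_BITS
}

/// Conservative predictor on effective addresses.
/// `false`: the two accesses cannot hit the same physical cache line.
/// `true`: they might; the physical addresses still have to be compared.
fn may_conflict(ea_a: u64, ea_b: u64) -> bool {
    line_in_page(ea_a) == line_in_page(ea_b)
}
```

A predicted conflict is only a hint: two different pages can share the same line index, so the final check still happens on physical addresses, as stated in the chat.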
Jacob Lifshay says:all loads/stores in a core...*not* across all cores
Jacob Lifshay says:the other cores have to rely on cache coherency protocols
Jacob Lifshay says:yes, but the fetch_add we need are the ones that can be done
in l1/l2/l3/memory, wherever is fastest
Dmitry says:Basically all but cmpxchg (or ll/sc) are for performance reasons,
aren't they?
Dmitry says:Because you can implement anything in terms of cmpxchg.
Jacob Lifshay says:cmpxchg or ll/sc is slow tho....
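
To illustrate the "anything in terms of cmpxchg" point, here is a minimal sketch in Rust of an arbitrary read-modify-write built from a compare-exchange retry loop. The helper name `rmw_via_cas` is hypothetical, and the choice of `SeqCst` on success is just an assumption for the example. On a contended location the loop may retry many times, which is one reason this approach tends to be slower than a dedicated instruction:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical helper: atomically replaces the value in `cell` with
/// `op(old)` using only compare-exchange, and returns the previous value.
fn rmw_via_cas(cell: &AtomicU64, op: impl Fn(u64) -> u64) -> u64 {
    let mut old = cell.load(Ordering::Relaxed);
    loop {
        // The weak variant may fail spuriously, which the loop absorbs.
        match cell.compare_exchange_weak(old, op(old), Ordering::SeqCst, Ordering::Relaxed) {
            Ok(prev) => return prev,
            // Another thread changed the value (or a spurious failure
            // occurred): retry with the value that was actually observed.
            Err(current) => old = current,
        }
    }
}
```

For example, `rmw_via_cas(&cell, |v| v * 7)` performs an atomic multiply, an operation with no dedicated instruction on most architectures.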
Konstantinos says:might be a stupid question, but would it be possible to have
2 implementations for the same atomics? ie keep the existing L2 ones for
scalable systems and replace them with lighter implementations for CPUs that
target embedded/desktop/workstations/etc, non-multicore systems at any rate
Jacob Lifshay says:we're keeping the existing ll/sc atomics for back compat and
because they're fully general...fetch_add instructions are useful cuz they can
run faster.
Dmitry says:This would be strange to have different implementations across the
same arch.
Konstantinos says:I *did* say it might be a stupid question 😃
Jacob Lifshay says:you can't just use the existing instructions since you'd
have to combine 5-7 instructions into one fetch_add microop -- terrible
Dmitry says:ll/sc is notoriously difficult to use, and has no direct
counterpart in memory model
Dmitry says:IIRC even ARM got its cmpxchg recently
Konstantinos says:most of the time it's easier to change the hardware than the
software
Dmitry says:Yeah I agree.
Jacob Lifshay says:uuh, ll/sc works just fine -- when you need the full
generality. it's slower otherwise
Dmitry says:Granted that software is ready for hardware changes
Dmitry says:😃
Dmitry says:You can have the generality with cmpxchg. This basically boils down
to loop.
Jacob Lifshay says:all of cmpxchg/ll/sc are often slower than a dedicated
fetch_add instruction...no loop needed
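
For contrast with a retry loop, a small sketch of what "no loop needed" looks like from the software side, using Rust's standard `fetch_add`. The function name `count_with_fetch_add` is hypothetical. Whether the single call becomes one instruction or an ll/sc loop depends on the target architecture, which is the point under discussion:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical helper: `threads` threads each increment a shared counter
/// `iters` times, with one fetch_add per increment and no retry loop in
/// the source code.
fn count_with_fetch_add(threads: u64, iters: u64) -> u64 {
    let counter = AtomicU64::new(0);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..iters {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    // All scoped threads have been joined at this point.
    counter.load(Ordering::Relaxed)
}
```

No increments are lost regardless of how the threads interleave, so the result is always `threads * iters`.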
Dmitry says:And you cannot have it with ll/sc, because it doesn't map well to
higher-level languages.
Dmitry says:Yes that's why I started with "for performance reasons".
Jacob Lifshay says:high-level languages map to a ll/sc loop currently
Dmitry says:I guess some cycles can be saved with relaxed semantics
Dmitry says:I mean C memory model
Jacob Lifshay says:yes, but you often need
acquire/release/sequential-consistency
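
A minimal sketch of why relaxed ordering alone is often not enough: passing a value between threads needs a release store paired with an acquire load on the flag, while the payload itself can stay relaxed. The function name `pass_message` is hypothetical, and this is only an illustration of the C/C++11-style orderings as exposed by Rust:

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::thread;

/// Hypothetical helper: a writer thread publishes `value`, and the calling
/// thread waits for the flag and reads the value back.
fn pass_message(value: u64) -> u64 {
    let data = AtomicU64::new(0);
    let ready = AtomicBool::new(false);
    thread::scope(|s| {
        s.spawn(|| {
            data.store(value, Ordering::Relaxed);
            // Release: the store to `data` above becomes visible to any
            // thread that observes `ready == true` with an acquire load.
            ready.store(true, Ordering::Release);
        });
        // Acquire: pairs with the release store in the writer thread.
        while !ready.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }
        data.load(Ordering::Relaxed)
    })
}
```

If both flag accesses were relaxed, the reader would be allowed to see the flag set and still read the stale `0` from `data`.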
Dmitry says:That is, cmpxchg made its way to higher level
Dmitry says:Yeah
Dmitry says:Ok we basically flooded the chat with atomics 😃
Jacob Lifshay says:sve2 is only scalable from the hw end, sw sees a fixed impl.
dependent size
Jacob Lifshay says:suggested atomic implementation:
https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/atomics/discussion.mdwn;h=050e83458bdb6727d5eb43111de71277ef46a44a;hb=HEAD
Jacob Lifshay says:paul: you wanted a description of a proposed atomics
implementation:
https://bugs.libre-soc.org/show_bug.cgi?id=236#c46
Jacob Lifshay says:iirc aws graviton3 has sve2 support
Jacob Lifshay says:amazon, not alibaba
Jacob Lifshay says:for graviton3