This does work on ARM? #6

Open
mcourteaux opened this issue Feb 10, 2023 · 5 comments

Comments

@mcourteaux

mcourteaux commented Feb 10, 2023

Your README says that you "need to generate a dmb instruction on ARM, which is impossible in C++11". That sounds strange to me, as a program written against the C++ language memory model should behave correctly under the "as if" rule, regardless of the hardware.

Testing this on Godbolt does indeed seem to generate quite a few dmb ish instructions:
https://godbolt.org/z/93oha5Mn3

So, what's going on with the claim in the README? Are you saying that this seqlock implementation is incorrect with respect to the language memory model, but just happens to work on the x86-TSO memory model?

@mcourteaux
Author

mcourteaux commented Feb 10, 2023

Hmm, it seems that changing the -mcpu=cortex-m3 to -mcpu=cortex-a76 removes the dmb instructions...

Aha, but! It uses other instructions, such as stlr (Store-Release Register) and ldar (Load-Acquire Register). So it might still be a correct seqlock in the end?
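
For reference, a minimal sketch of the kind of source that produces these instructions. The names `seq`, `publish` and `observe` are made up for illustration and are not taken from the repository. With C++11 atomics, a release store and an acquire load typically become stlr and ldar on AArch64 targets such as cortex-a76, whereas ARMv7 targets such as cortex-m3 have no such instructions and get a plain store or load combined with dmb ish.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sequence counter, named for illustration only.
std::atomic<std::uint32_t> seq{0};

// Release store: typically `stlr` on AArch64, `dmb ish` + `str` on ARMv7.
void publish(std::uint32_t v) {
  seq.store(v, std::memory_order_release);
}

// Acquire load: typically `ldar` on AArch64, `ldr` + `dmb ish` on ARMv7.
std::uint32_t observe() {
  return seq.load(std::memory_order_acquire);
}
```

Either lowering implements the same acquire/release semantics, so the absence of dmb on cortex-a76 does not by itself say anything about correctness.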

@rigtorp
Owner

rigtorp commented Feb 28, 2023 via email

@stravager

stravager commented Mar 2, 2023

The count.fetch_add(0) recommended by that paper is actually recognized by clang - on x64 it compiles to an mfence followed by a regular load. gcc doesn't do this, however.

Best bet might be count.fetch_add(0) as a default with a manual mfence+load for platforms known to need it (e.g. gcc x64). Then at least you know the ordering will be correct even if it doesn't perform optimally everywhere.
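
To make the suggestion concrete, here is a rough sketch of a reader built on `count.fetch_add(0)`. Only `count` comes from the discussion above; `TinySeqlock`, `a`, `b`, `write` and `try_read` are hypothetical names, a single writer is assumed, and this is not the repository's implementation. The payload is stored in relaxed atomics so that concurrent access is not a data race as far as the language is concerned.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical single-writer seqlock guarding two words.
struct TinySeqlock {
  std::atomic<std::uint64_t> count{0};
  std::atomic<std::uint64_t> a{0};
  std::atomic<std::uint64_t> b{0};

  // Assumes exactly one writer thread.
  void write(std::uint64_t x, std::uint64_t y) {
    // Acquire RMW: count becomes odd, and the data stores below
    // cannot be reordered before this increment.
    std::uint64_t c = count.fetch_add(1, std::memory_order_acquire);
    a.store(x, std::memory_order_relaxed);
    b.store(y, std::memory_order_relaxed);
    // Release store: count becomes even again, publishing the data.
    count.store(c + 2, std::memory_order_release);
  }

  // Returns true if a consistent snapshot was read into x and y.
  bool try_read(std::uint64_t &x, std::uint64_t &y) {
    std::uint64_t c0 = count.load(std::memory_order_acquire);
    x = a.load(std::memory_order_relaxed);
    y = b.load(std::memory_order_relaxed);
    // "Read-don't-modify-write": adding zero leaves count unchanged,
    // but as a release RMW it keeps the data loads above from being
    // reordered after the second read of the counter.
    std::uint64_t c1 = count.fetch_add(0, std::memory_order_release);
    return c0 == c1 && (c0 & 1) == 0;
  }
};
```

A caller would loop on `try_read` until it returns true. The cost is that every read performs an atomic read-modify-write on `count`, which is exactly the part a compiler may or may not lower to something cheaper.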

@axelriet

axelriet commented Mar 2, 2023

Fascinating. I’d do a couple of lines of inline asm, or use intrinsics and plain C and be sure. I know intrinsics aren’t very portable but that’s one of those cases where correctness and performance prevail over portability. Just my 0.02ct
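
Something along these lines, as a sketch only. `full_barrier` and `load_after_barrier` are hypothetical names, the preprocessor checks cover just GCC/Clang-style toolchains plus the x64 intrinsic, and anything unrecognized falls back to a standard seq_cst fence. Hand-written barriers like this sit outside what the C++ memory model guarantees, so treat it as a starting point rather than a verified fix.

```cpp
#include <atomic>
#include <cstdint>
#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>  // _mm_mfence (SSE2)
#endif

// Hypothetical helper: emit a full hardware barrier for the target.
inline void full_barrier() {
  // Compiler-only barrier, so the optimizer does not move memory
  // accesses across the hardware barrier below.
  std::atomic_signal_fence(std::memory_order_seq_cst);
#if defined(__x86_64__) || defined(_M_X64)
  _mm_mfence();
#elif defined(__GNUC__) && \
    (defined(__aarch64__) || (defined(__ARM_ARCH) && __ARM_ARCH >= 7))
  __asm__ __volatile__("dmb ish" ::: "memory");
#else
  // Unknown target: fall back to the portable fence.
  std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
  std::atomic_signal_fence(std::memory_order_seq_cst);
}

// Hypothetical reader step: barrier followed by a plain (relaxed) load
// of the sequence counter.
inline std::uint64_t load_after_barrier(const std::atomic<std::uint64_t> &c) {
  full_barrier();
  return c.load(std::memory_order_relaxed);
}
```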

@rigtorp
Owner

rigtorp commented Sep 18, 2023 via email

4 participants