refactor(add-carry): with Clang on non-x86 (for example MacOS) use builtin add-carry instead of u128 #411

Merged (1 commit) on Jan 4, 2025

mratsim (Owner) commented Jun 28, 2024

Potentially a quick win to accelerate Constantine on Apple M1~3.

See tracking compiler inefficiencies #357

Use addcarry builtin for Clang (GCC is incredibad) on non-x86 platforms.

To be benched with CTT_ASM=0 CC=clang nimble bench_fp.
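
For readers unfamiliar with the technique, here is a minimal C sketch of what "builtin add-carry instead of u128" means at the compiler level. It is not Constantine's code (the library is written in Nim); the names `addcarry_u64`, `add256` and `HAVE_BUILTIN_ADDC` are hypothetical, and the only compiler feature assumed is Clang's documented `__builtin_addcll`. When that builtin is unavailable, the sketch falls back to the 128-bit route, and then to plain comparisons.

```c
#include <assert.h>
#include <stdint.h>

/* Detect the Clang-style builtin; other compilers take a fallback path. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_addcll)
#    define HAVE_BUILTIN_ADDC 1
#  endif
#endif

/* Hypothetical helper: *sum = a + b + carry_in (mod 2^64); returns the carry-out (0 or 1). */
static inline uint64_t addcarry_u64(uint64_t a, uint64_t b,
                                    uint64_t carry_in, uint64_t *sum) {
    assert(carry_in <= 1);
#if defined(HAVE_BUILTIN_ADDC)
    /* Builtin route: the compiler sees the carry chain directly. */
    unsigned long long carry_out;
    *sum = (uint64_t)__builtin_addcll(a, b, carry_in, &carry_out);
    return (uint64_t)carry_out;
#elif defined(__SIZEOF_INT128__)
    /* u128 route: widen, add, then split into low word and carry. */
    unsigned __int128 t = (unsigned __int128)a + b + carry_in;
    *sum = (uint64_t)t;
    return (uint64_t)(t >> 64);
#else
    /* Portable route: detect wrap-around with comparisons. */
    uint64_t t = a + b;
    uint64_t r = t + carry_in;
    *sum = r;
    return (uint64_t)(t < a) | (uint64_t)(r < t);
#endif
}

/* Hypothetical 4-limb (256-bit, little-endian limbs) addition; returns the final carry. */
static inline uint64_t add256(uint64_t r[4], const uint64_t a[4],
                              const uint64_t b[4]) {
    uint64_t carry = 0;
    for (int i = 0; i < 4; i++) {
        carry = addcarry_u64(a[i], b[i], carry, &r[i]);
    }
    return carry;
}
```

All three routes compute the same result; the difference is only in how easily the compiler can turn the carry chain into the target's native add-with-carry instructions, which is what the benchmark above is meant to measure.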

mratsim (Owner, Author) commented Jan 4, 2025

On an Apple M4 Max, this is a 10% performance improvement for scalar multiplication.

@mratsim mratsim merged commit 64ae9c8 into master Jan 4, 2025
24 checks passed
@mratsim mratsim deleted the clang-addcarry-builtin branch January 4, 2025 22:29