Compiling failed in the environment of (cuda11.3, arch_sm=86) #83

Open
henry123-boy opened this issue Jul 3, 2023 · 2 comments

Comments

@henry123-boy

Hi! Thank you for your excellent work and code. I recently ran into a compilation problem on my server, whose environment is CUDA 11.3 with arch_sm=86.
The error is reported as follows:
"ptxas /tmp/tmpxft_0006c5ba_00000000-6_block6x6_pcg_weber.ptx, line 4136; error : Instruction 'shfl' without '.sync' is not supported on .target sm_70 and higher from PTX ISA version 6.4"
I hope to get a reply.

@henry123-boy
Author

henry123-boy commented Jul 3, 2023

Update:
I solved this problem by replacing

asm volatile (
        "{.reg .f32 r0;"
        ".reg .pred p;"
        "shfl.up.b32 r0|p, %1, %2, 0;"
        "@p add.f32 r0, r0, %1;"
        "mov.f32 %0, r0;}"
        : "=f"(result) : "f"(x), "r"(offset));

with

asm volatile (
        "{.reg .f32 r0;"
        ".reg .pred p;"
        "shfl.sync.up.b32 r0|p, %1, %2, 0, -1;"
        "@p add.f32 r0, r0, %1;"
        "mov.f32 %0, r0;}"
        : "=f"(result) : "f"(x), "r"(offset));

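The reason is that the shfl instruction without .sync is not supported on sm_70 and higher targets from PTX ISA version 6.4 onward. The .sync form takes one extra operand, the member mask; -1 selects all 32 lanes of the warp.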
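
For anyone who would rather avoid inline PTX, the same step can probably be written with the CUDA intrinsic __shfl_up_sync. The following is only a minimal sketch and not code from this repository: the names shfl_up_add, warp_inclusive_scan and run_warp_scan are hypothetical, and it assumes a 1D thread block of exactly 32 threads with every lane active.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (name not from the repository): one scan step.
// Lanes with index >= offset add the value held `offset` lanes below;
// lower lanes keep x, which mirrors the predicated add in the PTX version.
__device__ __forceinline__ float shfl_up_add(float x, int offset) {
    const unsigned full_mask = 0xffffffffu;  // all 32 lanes, same as -1 in the PTX
    float from_below = __shfl_up_sync(full_mask, x, offset);
    int lane = threadIdx.x & 31;             // assumes a 1D thread block
    return (lane >= offset) ? x + from_below : x;
}

// Illustrative kernel: inclusive prefix sum over a single warp.
__global__ void warp_inclusive_scan(const float* in, float* out) {
    float v = in[threadIdx.x];
    for (int offset = 1; offset < 32; offset <<= 1) {
        v = shfl_up_add(v, offset);
    }
    out[threadIdx.x] = v;
}

// Host-side helper for trying the kernel on exactly 32 values.
// CUDA error checking is omitted to keep the sketch short.
std::vector<float> run_warp_scan(const std::vector<float>& host_in) {
    assert(host_in.size() == 32);
    const size_t bytes = 32 * sizeof(float);
    float* dev_in = nullptr;
    float* dev_out = nullptr;
    cudaMalloc(&dev_in, bytes);
    cudaMalloc(&dev_out, bytes);
    cudaMemcpy(dev_in, host_in.data(), bytes, cudaMemcpyHostToDevice);

    warp_inclusive_scan<<<1, 32>>>(dev_in, dev_out);

    std::vector<float> host_out(32);
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(host_out.data(), dev_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev_in);
    cudaFree(dev_out);
    return host_out;
}
```

Calling run_warp_scan on 32 ones should give the running sums 1, 2, ..., 32. If only some lanes of the warp reach the shuffle, the full mask is no longer valid and has to be replaced by a mask of the participating lanes.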

@ghost

ghost commented May 15, 2024

Hi, I am now running into the same problem. Could you tell me where to modify the inline assembly code?
