Closed
Description
Describe the bug
The -ffast-math switch causes backend failures and/or LLVM link failures when device code calls some double-precision <cmath> functions, such as std::sin, std::cos, and std::exp.
This was discovered compiling LAMMPS with Kokkos using SYCL for CUDA.
To Reproduce
ogin29:jack$ cat exp.cpp
#include <cmath>
#include <cstdlib>
#include <iostream>
#include <vector>
#include <sycl/sycl.hpp>
using namespace sycl;

// Create an exception handler for asynchronous SYCL exceptions
static auto exception_handler = [](sycl::exception_list e_list) {
  for (std::exception_ptr const &e : e_list) {
    try {
      std::rethrow_exception(e);
    }
    catch (std::exception const &e) {
#if _DEBUG
      std::cout << "Failure" << std::endl;
#endif
      std::terminate();
    }
  }
};

int
main()
{
  auto d_selector{default_selector_v};
  const int N = 1024;
  std::vector<double> in(N), out(N);
  std::srand(1234);
  for (int i = 0; i < N; ++i)
    in[i] = std::rand() / (double) RAND_MAX;
  // Raw host pointers are captured by the kernel; this is enough to trigger
  // the compile-time failure reported here.
  double *d_in = in.data(), *d_out = out.data();
  queue q(d_selector, exception_handler);
  range num_items{N};
  auto e =
    q.parallel_for(num_items, [=](auto i) {
      //d_out[i] = std::exp(std::sin(d_in[i]) + std::cos(d_in[i]));
      d_out[i] = std::sin(d_in[i]) + std::cos(d_in[i]);
    });
  e.wait();
  std::cout << out[0];
}
=====
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend --cuda-gpu-arch=sm_80 --cuda-path=$CUDATOOLKIT_HOME exp.cpp -o exp -ffast-math
clang-16: warning: CUDA version 11.7 is only partially supported [-Wunknown-cuda-version]
fatal error: error in backend: Cannot select: t11: f64 = fsin nnan ninf nsz arcp contract afn reassoc t10
  t10: f64,ch = load<(load (s64) from %ir.arrayidx.i.i, !tbaa !65, addrspace 1)> t0, t7, undef:i64
    t7: i64 = add t2, t6
      t2: i64,ch = CopyFromReg t0, Register:i64 %1
        t1: i64 = Register %1
      t6: i64 = shl t4, Constant:i32<3>
        t4: i64,ch = CopyFromReg t0, Register:i64 %2
          t3: i64 = Register %2
        t19: i32 = Constant<3>
    t9: i64 = undef
In function: _ZTSN4sycl3_V16detail18RoundedRangeKernelINS0_4itemILi1ELb1EEELi1EZ4mainEUlT_E_EE
llvm-foreach:
clang-16: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 16.0.0 (https://github.com/intel/llvm d164fd946341dba28d6759aa2938161ce0e83647)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /pscratch/sd/l/lfmeadow/llvm-build/install/bin
clang-16: note: diagnostic msg: Error generating preprocessed source(s).
If std::exp is called as well (the commented-out line in the kernel), a different error message appears.
Environment (please complete the following information):
Linux, NVIDIA A100, CUDATOOLKIT_HOME=/opt/nvidia/hpc_sdk/Linux_x86_64/22.5/cuda/11.7, on Perlmutter.
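For checking device results once a build succeeds (for example without -ffast-math), a host-only reference for the kernel's arithmetic may help. The following is a minimal sketch, not part of the original report: the helper names make_inputs, reference_sin_plus_cos, and max_abs_diff are hypothetical, and the code needs no SYCL. It does not exercise the compiler bug itself.

```cpp
// Host-only reference for the reproducer's arithmetic (no SYCL required).
// Assumption: all helper names below are hypothetical, chosen for illustration.
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Generate inputs the same way the reproducer does: values in [0, 1].
std::vector<double> make_inputs(std::size_t n, unsigned seed) {
    assert(n > 0);
    std::vector<double> in(n);
    std::srand(seed);
    for (std::size_t i = 0; i < n; ++i)
        in[i] = std::rand() / static_cast<double>(RAND_MAX);
    return in;
}

// Compute sin(x) + cos(x) on the host for every element.
std::vector<double> reference_sin_plus_cos(const std::vector<double>& in) {
    std::vector<double> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = std::sin(in[i]) + std::cos(in[i]);
    return out;
}

// Maximum absolute difference between two result vectors of equal length.
double max_abs_diff(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        worst = std::fmax(worst, std::fabs(a[i] - b[i]));
    return worst;
}
```

A possible use is to call reference_sin_plus_cos(make_inputs(1024, 1234)) and compare it with the device output via max_abs_diff. For inputs in [0, 1] every reference value lies between 1 and sqrt(2), which gives a quick sanity check. Any tolerance applied to the comparison is an assumption to be chosen by the user.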