At the moment, C2C examples require https://github.com/mnicely/cub.
These examples utilize the following toolsets:
- cuFFT
- cuFFTDx (requires joining the CUDA Math Library Early Access Program: https://developer.nvidia.com/CUDAMathLibraryEA)
- C++11
Hardware: Volta or newer GPU (compute capability 7.0+)
This code runs three scenarios:
- cuFFT using cudaMalloc
- cuFFT using cudaMallocManaged
- cuFFTDx using cudaMalloc
The objectives are to:
- Compare coding styles between cuFFT using cudaMalloc and cuFFT using cudaMallocManaged
- Compare performance between cuFFT using cudaMalloc and cuFFT using cudaMallocManaged
- Compare performance and results between cuFFT and cuFFTDx
For float
make
./cuFFT_vs_cuFFTDx
For double
export USE_DOUBLE=1
make
./cuFFT_vs_cuFFTDx
To compare results (cuFFT and cuFFTDx outputs are not expected to match exactly)
export PRINT=1
make
./cuFFT_vs_cuFFTDx
To compare results for double
export PRINT=1
export USE_DOUBLE=1
make
./cuFFT_vs_cuFFTDx
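Because cuFFT and cuFFTDx are not expected to produce bit-identical output, a comparison has to allow for a tolerance. The following is a minimal sketch of such a check in plain C++, not the comparison used by this repository. The helper name `count_mismatches`, the choice of a relative tolerance, and the tolerance value in the usage note are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of this repository).
// Counts the elements of `test` that differ from `ref` by more than a
// relative tolerance. The tolerance is scaled by max(1, |ref[i]|) so that
// values near zero are effectively compared with an absolute tolerance.
template <typename T>
std::size_t count_mismatches(const std::vector<std::complex<T>>& ref,
                             const std::vector<std::complex<T>>& test,
                             T tol) {
    assert(ref.size() == test.size());
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const T scale = std::max<T>(T(1), std::abs(ref[i]));
        if (std::abs(ref[i] - test[i]) > tol * scale) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

With both result buffers copied back to the host, a call such as `count_mismatches(cufft_out, cufftdx_out, 1e-3f)` returning 0 would correspond to the "All values match!" message. A suitable tolerance depends on the FFT size and on whether float or double is used.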
FFT Size: 2048 -- Batch: 16384 -- FFT Per Block: 1 -- EPT: 16
cufftExecC2C - FFT/IFFT - Malloc XX.XX ms
cufftExecC2C - FFT/IFFT - Managed XX.XX ms
Compare results
All values match!
cufftExecC2C - FFT/IFFT - Dx XX.XX ms
Compare results
All values match!
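The first line of the output lists the FFT size, the batch count, the number of FFTs per block, and EPT (elements per thread). As a rough illustration of how these values relate to the cuFFTDx kernel launch, the sketch below derives a block shape and a block count from them. It assumes each thread handles EPT elements of a single FFT and each block handles "FFT Per Block" FFTs. The struct and function names are hypothetical, and cuFFTDx reports the actual block dimensions itself, so treat the numbers as an illustration only.

```cpp
#include <cassert>

// Hypothetical struct mirroring the fields printed in the output header.
struct FftConfig {
    unsigned size;            // FFT Size
    unsigned batch;           // Batch
    unsigned ffts_per_block;  // FFT Per Block
    unsigned ept;             // EPT (elements per thread)
};

// Hypothetical result type: threads per block in x and y, plus block count.
struct LaunchDims {
    unsigned threads_x;
    unsigned threads_y;
    unsigned blocks;
};

// Assumption: size/ept threads cooperate on one FFT, and the batch is
// split evenly across blocks.
inline LaunchDims launch_dims(const FftConfig& c) {
    assert(c.ept != 0 && c.size % c.ept == 0);
    assert(c.ffts_per_block != 0 && c.batch % c.ffts_per_block == 0);
    return {c.size / c.ept, c.ffts_per_block, c.batch / c.ffts_per_block};
}
```

For the configuration shown above (size 2048, batch 16384, 1 FFT per block, EPT 16), this gives 128 threads per block and 16384 blocks.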
- This code utilizes cuFFT Callbacks
- This code utilizes separate compilation and linking