I've been working with SimpleFFT in a project for a while now, and wanted to report a few things I've changed in our local version.
Most importantly, "#pragma omp parallel for" involves some overhead for setting up a multi-threaded context and passing value ranges to threads. I've never had good results with it in innermost loops, which is how SimpleFFT uses it right now.
"#pragma omp simd" is more than likely a suitable alternative. Rather than setting up a multithreaded context, simd asks the compiler to vectorize the loop, leaving the multithreading (and its overhead) to outer contexts instead.
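To illustrate the split I mean, here is a minimal sketch. It is not SimpleFFT code: `scale_row` and `scale_rows` are hypothetical helpers made up for this example. The innermost loop carries only "#pragma omp simd", and the thread team is created once, in the outer loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SimpleFFT): scales one row in place.
// The innermost loop is only vectorized; no thread team is created here.
void scale_row(std::vector<float>& row, float factor)
{
    float* data = row.data();
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(row.size());
    #pragma omp simd
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// Hypothetical outer driver: the threading overhead is paid once per call,
// and each thread processes whole rows.
void scale_rows(std::vector<std::vector<float>>& rows, float factor)
{
    const std::ptrdiff_t n_rows = static_cast<std::ptrdiff_t>(rows.size());
    #pragma omp parallel for
    for (std::ptrdiff_t r = 0; r < n_rows; ++r) {
        scale_row(rows[r], factor);
    }
}
```

Without OpenMP enabled the pragmas are ignored and the code still runs serially. GCC and Clang also accept `-fopenmp-simd`, which honors only the simd directives, if the threading part isn't wanted at all.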
With this change, I've also had positive results enabling the flag in clang. I noticed there was an ifdef disabling it for that environment.
No worries, I mostly wanted to make sure the issue is in here and not just
in my private copy. :)
On Mon, 7 Dec 2020 at 19:19, Dmitry Ivanov <[email protected]> wrote: