Stencil Construction Profiling - Speed up with GPU? #50
JanGaertner
started this conversation in
General
Replies: 0 comments
Profiling Stencil Collection
The stencil collection performed while constructing the WENOBase class is expensive and dominates start-up time. To find out where that time goes, the construction of the WENOBase class was profiled with AMDuProf version 3.5.671.0 on a single core of an AMD Ryzen 5 PRO 3500U.
The hottest function is the calculation of the Gauss quadrature in WENOEXT/libWENOEXT/WENOBase/geometryWENO/geometryWENO.C (line 346 in f45593a).
Even though manual AVX instructions are already used there to improve performance, the calculation of the powers still takes a long time. The assembly code analysis of the profiler confirms that the vector operations account for most of the time.
The quadrature could possibly be reformulated as a single matrix multiplication, or as several matrix multiplications, which could then be executed on a GPU to improve performance.
Speed up with GPU