
OPENBLAS compilation option assumes OpenBLAS installation in one location, while it can be installed to root tree #45

Open
pmantas opened this issue Dec 6, 2024 · 5 comments

Comments

pmantas commented Dec 6, 2024

For the OPENBLAS option, OPENBLAS_INSTALL_ROOT is required from users, and the header file location is hardcoded as OPENBLAS_INSTALL_ROOT/lnInclude.

However, if OpenBLAS is installed normally into the root file tree, the header files can be located in /usr/include/openblas/.

There is no way to specify this directory without modifying the Make scripts.

A quick fix could be to use both OPENBLAS_INSTALL_ROOT and OPENBLAS_INSTALL_ROOT/lnInclude as header file locations.
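The quick fix could look roughly like this in the options file (a sketch only; the actual file layout and library flags in the repository may differ):

```make
EXE_INC = \
    -I$(OPENBLAS_INSTALL_ROOT)/lnInclude \
    -I$(OPENBLAS_INSTALL_ROOT)

LIB_LIBS = \
    -L$(OPENBLAS_INSTALL_ROOT)/lib \
    -lopenblas
```

With both `-I` paths present, either install layout would compile without editing the script.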

arintanen (Member) commented Jan 27, 2025

I think it could be time to get rid of this manual path typing.
pkg-config could be used to set these paths automatically, and the options file for OpenBLAS would then just be:

EXE_INC = \
     $(shell pkg-config --cflags openblas)

LIB_LIBS = \
     $(shell pkg-config --libs openblas)

MKL and standalone LAPACK would be set using the pkg-config names mkl-dynamic-lp64-seq and lapacke, respectively.
This would be much easier from a user's perspective: currently MKL_ROOT and OPENBLAS_INSTALL_ROOT need to be set by the user, and the locations of header files and libraries can vary a lot between systems, which forces the user to modify the options file manually.

What do you think @moreff @blttkgl @hamsteri15 @kahilah ?

hamsteri15 (Contributor) commented

Someone should probably make a detailed comparison of how much the third-party BLAS libraries actually improve performance. I personally have never used them. If I recall correctly, they are only used in the LU decomposition of the Jacobian. While the third-party BLAS implementations of LU decomposition may have some architecture-specific optimizations, I would still like to see some numbers on how much benefit they actually bring. Note that GSL also has a reasonably nice LU implementation (less than 100 lines) which could be used on all platforms.

I would also be slightly cautious about any assumptions made about file paths. It is quite common to compile third-party libraries from source (at least on clusters), and in that case I'm not sure how well package managers will help in finding the paths. Certainly, making any assumptions about the paths is a bad idea in general.
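For source builds, pkg-config only helps if the install prefix has been added to its search path by hand, which is essentially the same manual step as setting a root variable (the prefix below is hypothetical):

```shell
# Make a source-built OpenBLAS visible to pkg-config
# (assumes the build installed an openblas.pc under lib/pkgconfig)
export PKG_CONFIG_PATH=$HOME/opt/openblas/lib/pkgconfig:$PKG_CONFIG_PATH
pkg-config --cflags openblas
```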

My suggestions are:

  1. Have a separate fallback LU decomposition function (possibly from FOAM_SRC) which is used in case the user does not want to provide a separate linalg package.
  2. If the user explicitly wants to use something more exotic, make it possible to have headers/binaries in any possible location without any assumptions, i.e. have BLAS_INCLUDE/BLAS_LIB as separate parameters.

In point 2 it could be possible to try to guess the paths in some elegant way, but there should always be the possibility of setting the paths to anything.
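Point 2 could look like this in the options file (a sketch; BLAS_INCLUDE and BLAS_LIB are the proposed parameters, not existing ones, and the library name would depend on the chosen package):

```make
EXE_INC = \
    -I$(BLAS_INCLUDE)

LIB_LIBS = \
    -L$(BLAS_LIB) \
    -lopenblas
```

Because both variables are plain paths set by the user, no assumptions about the install layout are baked into the build.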

arintanen (Member) commented

@hamsteri15
I did a small comparison a year ago. The seulex_LAPACK was much faster than the basic seulex for all mechanisms I tested, but the speed-up was due to a different implementation that follows the original algorithm by Hairer; see https://bugs.openfoam.org/view.php?id=2972

The pure effect of MKL was earlier actually negative for small mechanisms, since 64-bit integers were used. After changing to 32-bit (see 2d9e235) there was no performance gain for hydrogen mechanisms, but for larger mechanisms like ammonia it was significantly faster.

hamsteri15 (Contributor) commented

@arintanen I can see that for large matrices there would be some performance increase. The GSL implementation also treats small and large matrices differently. It should not be too difficult to simply take the GSL implementation and copy it into the dlb repository. The GSL vector/matrix types are very simple, and I would assume the whole algorithm would be around 100 lines.

From my experience, third-party BLAS implementations are very handy when you have a lot of matrices that you need to invert independently of each other. Typically the libraries offer a "batched_lu_decompose" which is very fast for the set of problems as a whole. For our use case this is slightly problematic, since the ODE solvers are iterative and multiple MPI processes are used anyway.

kahilah commented Jan 28, 2025

Hi,

I agree with @hamsteri15 that it is good to have the flexibility to choose the linear algebra package paths, as they tend to vary a lot between clusters.

For example, back in the day the include/lib structure varied depending on the LAPACK version. I had different environment variables at university, at work, and at home ;)

Regarding performance, there are clear gains when using a third-party LAPACK library. First of all, LAPACK was chosen because it is known to be the best for dense/full matrices (with pyJac you have a dense, almost full matrix). I have some old presentations where I benchmarked different LAPACK implementations: KLU (SuiteSparse), SuperLU, MKL LAPACK, etc. For sparse matrices of size >100, KLU starts to be faster, but for dense matrices LAPACK was always the fastest. On one particular cluster I worked on in the past, the native MKL LAPACK implementation was faster than OpenBLAS, but that was most probably due to better Intel architecture-specific compilation of MKL.

I have no experience with GSL, so I'm not sure what kind of matrices it targets.
