llama-cpp-python cuBLAS wheels

Wheels for llama-cpp-python compiled with cuBLAS support.

Requirements:

  • Windows and Linux x86_64
  • CPU with support for AVX, AVX2 or AVX512
  • CUDA 11.6 - 12.2
  • CPython 3.7 - 3.11
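Before picking a wheel, it can help to confirm which of these instruction sets your CPU actually supports. A minimal Linux-only sketch (the helper name is illustrative, not part of any tool) that classifies the flags line from /proc/cpuinfo:

```shell
#!/bin/sh
# Print the best matching wheel variant (AVX512, AVX2, AVX, or none)
# for a given CPU-flags string, as found in /proc/cpuinfo on Linux.
avx_level() {
    case " $1 " in
        *" avx512f "*) echo AVX512 ;;
        *" avx2 "*)    echo AVX2 ;;
        *" avx "*)     echo AVX ;;
        *)             echo none ;;
    esac
}

# On Linux, read the flags for the first CPU core:
if [ -r /proc/cpuinfo ]; then
    avx_level "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
fi
```

For the CUDA side, `nvcc --version` reports the installed CUDA toolkit version, and `nvidia-smi` shows the highest CUDA version the driver supports.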

Experimental Windows ROCm build for AMD GPUs: https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/releases/tag/rocm

Installation instructions:

To install, you can use this command:

python -m pip install llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117

This installs the latest llama-cpp-python version available from this index for CUDA 11.7. Replace cu117 in the URL to select a different CUDA version (e.g. cu121 for CUDA 12.1).
You can also replace AVX2 with AVX or AVX512 to match what your CPU supports.
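The extra index URL is simply the base address with the AVX level and CUDA version appended. A tiny helper (illustrative only, not part of pip or this repository) makes the pattern explicit:

```shell
#!/bin/sh
# Compose the wheel index URL from an AVX level (AVX, AVX2, or AVX512)
# and a CUDA version written without the dot (117 for 11.7, 121 for 12.1).
wheel_index() {
    echo "https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/$1/cu$2"
}

wheel_index AVX 121
# → https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cu121
```

The result is what you pass to pip, e.g. `--extra-index-url="$(wheel_index AVX 121)"`.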

You can install a specific version with:

python -m pip install llama-cpp-python==<version> --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117

An example for installing 0.1.62 for CUDA 12.1 on a CPU without AVX2 support:

python -m pip install llama-cpp-python==0.1.62 --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cu121

List of available versions:

python -m pip index versions llama-cpp-python --index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117

If you are replacing an existing installation, you may need to uninstall it before running the command above.
Alternatively, you can replace the existing version in a single command like so:

python -m pip install llama-cpp-python --force-reinstall --no-deps --index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117
-OR-
python -m pip install llama-cpp-python==0.1.66 --force-reinstall --no-deps --index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117
-OR-
python -m pip install llama-cpp-python --prefer-binary --upgrade --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu117

Wheels can be manually downloaded from: https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels


All wheels are compiled using GitHub Actions.
