
Chapter 6 - ModuleNotFoundError #16

Open

ritesh2014 opened this issue Nov 23, 2024 · 7 comments

Comments


ritesh2014 commented Nov 23, 2024

Chapter 6
I am running the code below in Colab connected to a T4 GPU.

%%capture
!pip install "langchain>=0.1.17" "openai>=1.13.3" "langchain_openai>=0.1.6" "transformers>=4.40.1" "datasets>=2.18.0" "accelerate>=0.27.2" "sentence-transformers>=2.5.1" "duckduckgo-search>=5.2.2"
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python

from llama_cpp.llama import Llama

# Load Phi-3
llm = Llama.from_pretrained(
    repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
    filename="*fp16.gguf",
    n_gpu_layers=-1,
    n_ctx=2048,
    verbose=False
)

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
in <cell line: 1>()
----> 1 from llama_cpp.llama import Llama
      2
      3 # Load Phi-3
      4 llm = Llama.from_pretrained(
      5     repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",

ModuleNotFoundError: No module named 'llama_cpp'

Please note this error occurs even after restarting the runtime
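For anyone hitting the same thing, a quick way to confirm whether the package actually made it into the environment (a minimal diagnostic sketch; note that %%capture hides any install-time build errors):

# Check whether llama_cpp is importable in the current interpreter.
import importlib.util
print("llama_cpp found:", importlib.util.find_spec("llama_cpp") is not None)

# Ask pip what, if anything, it installed; the %%capture above would
# have hidden a failed source build during the original install.
!pip show llama-cpp-python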

@MaartenGr (Contributor)

Thank you for sharing this issue. Could you try the following instead:

from llama_cpp import Llama

It seems the API has been updated. Since this is a very minor change, I have no problem implementing it here so that we can keep using the latest version of llama-cpp-python.

@ritesh2014 Could you let me know if this works?
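Applied to the chapter's snippet, that would look as follows (a sketch keeping the same parameters as above):

from llama_cpp import Llama  # updated import path

# Load Phi-3 exactly as before; only the import line changes.
llm = Llama.from_pretrained(
    repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
    filename="*fp16.gguf",
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=2048,
    verbose=False
)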

@ritesh2014 (Author)

The problem is that

!CMAKE_ARGS="-DLLAMA_CUDA=off" pip install llama-cpp-python

never gets installed. I am assuming this tries to install the latest version, 0.3.2. I tried a few other recent versions and only had luck with:

!CMAKE_ARGS="-DLLAMA_CUDA=off" pip install llama-cpp-python==0.2.76

After that I could import it with from llama_cpp.llama import Llama.
I am not sure why several other versions do not install in Colab. I checked cmake, which was the latest version, and the other dependencies were also in place.
Anyway, this is working with 0.2.76. You can reply to this comment or feel free to close the thread.

@MaartenGr (Contributor)

Ah, that's rather odd, as I previously had success with the latest version. Perhaps Google Colab has updated some dependencies that made this more difficult.

Either way, thanks for sharing your issue and the solution!

I might change it to !CMAKE_ARGS="-DLLAMA_CUDA=off" pip install llama-cpp-python==0.2.76, but locally the newest version is working for me. We want to make sure that readers are not stuck on an old version of llama-cpp-python to run the examples, so a different type of installation might be preferable.

I think you would also be able to use the pre-built wheels:

!pip install --no-cache-dir llama-cpp-python==0.3.2 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122
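As a quick sanity check after installing from that index (a minimal sketch; the package exposes its version string):

# Confirm the wheel installed and which version is active.
import llama_cpp
print(llama_cpp.__version__)  # e.g. "0.3.2" if the wheel install succeeded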

jyellow commented Dec 2, 2024

@MaartenGr Hi, I tried that pip install command (llama-cpp-python==0.3.2 from the pre-built wheels) in the Colab T4 environment, but it seems to use only the CPU, not the GPU, when I run this cell:

# Generate output
output = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Create a warrior for an RPG in JSON format."},
    ],
    response_format={"type": "json_object"},
    temperature=0,
)['choices'][0]['message']["content"]
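One way to verify where the model actually runs (a rough diagnostic; the "offloaded ... layers to GPU" line is what llama.cpp prints when verbose output is on):

# Check whether any process has memory resident on the GPU.
!nvidia-smi

# Reload with verbose=True and look for a line such as
# "offloaded 33/33 layers to GPU" in the load log; 0 offloaded
# layers means the installed wheel has no CUDA support.
llm = Llama.from_pretrained(
    repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
    filename="*fp16.gguf",
    n_gpu_layers=-1,
    n_ctx=2048,
    verbose=True,
)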

@MaartenGr (Contributor)

@jyellow Thank you for sharing! It may be that no wheels are available yet for that specific version. I tried it with the following:

!pip install --no-cache-dir llama-cpp-python==0.2.76 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122

and I can see that the model is loaded onto the GPU, so that should work!

jyellow commented Dec 3, 2024

@MaartenGr Thanks, you're right. I would like to further clarify that we should avoid installing the GPU-enabled version of llama-cpp-python in the Colab environment via the CMAKE_ARGS="-DGGML_CUDA=on" approach. That method compiles the CUDA support from source, and because the CPU performance of Colab's T4 environment is not great, the build consumes a significant amount of Colab usage time while the GPU sits idle.
Instead, it is better to use the command you provided to install a prebuilt package with GPU support directly, such as:

!pip install --no-cache-dir llama-cpp-python==0.2.90 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122
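To fail fast instead of silently falling back to a source build (a hedged check using pip's --only-binary flag):

# Refuse source builds entirely: this errors out immediately if no
# pre-built wheel exists for the requested version on the cu122 index.
!pip install --only-binary=:all: --no-cache-dir llama-cpp-python==0.2.90 \
    --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122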

@MaartenGr (Contributor)

@jyellow That is indeed correct! Building from source used to be the default way of installing llama-cpp-python, but they have since started publishing pre-built wheels. A quick note: always check whether pip actually uses a pre-built wheel (see the sketch below), since not all versions appear to have them. I see that you mentioned 0.2.90 with the pre-built wheels; can you confirm that that version works? I can confirm that 0.2.76 works with the pre-built wheels.
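One way to check without installing anything (a sketch; pip download fetches whichever distribution pip would pick for the install):

# Download, without installing, the exact distribution pip would use.
!pip download --no-deps -d /tmp/llama_wheels llama-cpp-python==0.2.90 \
    --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122
# A *.whl file here means a pre-built wheel was found; a *.tar.gz
# means pip would compile llama.cpp from source on Colab's CPU.
!ls /tmp/llama_wheels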
