Chapter 6 - ModuleNotFoundError #16
Comments
Thank you for sharing this issue. Could you try the following instead:

from llama_cpp import Llama

It seems that the API must have been updated, and considering this is a very minor change, I have no problem implementing it here to make sure that we can use the latest version of llama-cpp-python.

@ritesh2014 Could you let me know if this works?
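For reference, a minimal sketch of a defensive import that works with either package layout (which path is available depends on the installed llama-cpp-python version; this is only an illustration based on the two imports discussed in this thread):

# Try the newer top-level import first, then fall back to the older module path.
try:
    from llama_cpp import Llama
except ImportError:
    from llama_cpp.llama import Llama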
The problem is - after that I could import:

from llama_cpp.llama import Llama
Ah, that's rather odd, as I had previous success with the latest version. Perhaps Google Colab has updated some dependencies that might have made this more difficult. Either way, thanks for sharing your issue and the solution! I might change it accordingly.

I think you would also be able to use the pre-built wheels:

!pip install --no-cache-dir llama-cpp-python==0.3.2 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122
@MaartenGr Hi, I tried this pip install command in a Colab T4 environment, but it seems to have used only the CPU, not the GPU, when I ran this cell:

# Generate output
output = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Create a warrior for an RPG in JSON format."},
    ],
    response_format={"type": "json_object"},
    temperature=0,
)['choices'][0]['message']["content"]
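One way to check whether the model actually ended up on the GPU (a sketch, assuming a Colab runtime with a T4 attached and a CUDA-enabled llama-cpp-python build) is to load it with verbose=True and watch the llama.cpp log for offload messages, then inspect GPU memory with nvidia-smi:

# Load with verbose=True so llama.cpp prints how many layers were offloaded to the GPU.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
    filename="*fp16.gguf",
    n_gpu_layers=-1,  # request that all layers be offloaded
    n_ctx=2048,
    verbose=True,
)

# In a separate Colab cell, GPU memory usage should be substantial if offload worked:
# !nvidia-smi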
@jyellow Thank you for sharing! It may be that no wheels are available yet for that specific version. I tried it with the following:

!pip install --no-cache-dir llama-cpp-python==0.2.76 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122

and I can see that the model is loaded onto the GPU, so that should work!
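For reference, a quick sanity check after installing (a sketch; llama_supports_gpu_offload comes from the low-level bindings and its availability may depend on the installed llama-cpp-python version):

import llama_cpp

# Confirm the pinned version was actually installed from the CUDA wheel index.
print(llama_cpp.__version__)

# True if this build of llama.cpp can offload layers to the GPU.
print(llama_cpp.llama_supports_gpu_offload())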
@MaartenGr Thanks, you're right. I would like to further clarify that we should avoid installing the GPU-supported version of the llama-cpp-python library in the Colab environment using

!pip install --no-cache-dir llama-cpp-python==0.2.90 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122

since no pre-built CUDA wheel appears to be available for that version.
@jyellow That is indeed correct! It used to be the default version used for installing llama-cpp-python.
Chapter 6
I am running the code below in Colab connected to a T4.
%%capture
!pip install langchain>=0.1.17 openai>=1.13.3 langchain_openai>=0.1.6 transformers>=4.40.1 datasets>=2.18.0 accelerate>=0.27.2 sentence-transformers>=2.5.1 duckduckgo-search>=5.2.2
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
from llama_cpp.llama import Llama
# Load Phi-3
llm = Llama.from_pretrained(
    repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
    filename="*fp16.gguf",
    n_gpu_layers=-1,
    n_ctx=2048,
    verbose=False
)
ModuleNotFoundError Traceback (most recent call last)
in <cell line: 1>()
----> 1 from llama_cpp.llama import Llama
2
3 # Load Phi-3
4 llm = Llama.from_pretrained(
5 repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
ModuleNotFoundError: No module named 'llama_cpp'
Please note that this error occurs even after restarting the runtime.