33B and 65B weights? #94

What would it take to use 33B and 65B weights?
Also, 7B seems to work better than 13B right now.

Is it gonna work just with

Doesn't seem to work.

To work with the 30B model, you need to change the part counts on lines 34 and 35 of main.cpp to 1. Originally, the 30B file was split into 4 parts, just as the 13B file was split into 2 parts.

```cpp
// determine number of model parts based on the dimension
static const std::map<int, int> LLAMA_N_PARTS = {
    { 4096, 1 },
    { 5120, 1 },
    { 6656, 4 },
    { 8192, 8 },
};
```

Change it to:

```cpp
// determine number of model parts based on the dimension
static const std::map<int, int> LLAMA_N_PARTS = {
    { 4096, 1 },
    { 5120, 1 },
    { 6656, 1 },
    { 8192, 1 },
};
```

After that, just recompile and run it again. Credits to the user ItsPi3141, who gave the answer in issue #83.

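For context, here is a minimal, self-contained sketch of how a loader in this style might pick the part count from the model's embedding dimension. This is an illustration, not code from this repo; `n_embd` and the hardcoded dimension are assumptions standing in for the value read from the model header.

```cpp
// Sketch: look up the number of split files from the embedding dimension.
#include <cstdio>
#include <map>

static const std::map<int, int> LLAMA_N_PARTS = {
    { 4096, 1 }, // 7B
    { 5120, 1 }, // 13B
    { 6656, 1 }, // 30B, set to 1 for a single-file export
    { 8192, 1 }, // 65B, set to 1 for a single-file export
};

int main() {
    const int n_embd = 6656; // illustrative; normally read from the model header
    const auto it = LLAMA_N_PARTS.find(n_embd);
    // fall back to a single part if the dimension is unknown
    const int n_parts = (it != LLAMA_N_PARTS.end()) ? it->second : 1;
    printf("loading model in %d part(s)\n", n_parts);
    return 0;
}
```
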
You don't need to touch any code for this.
edit: Actually I was assuming llama.cpp (not this fork)

What value do you put?

Actually I am assuming llama.cpp (not this fork).
If it is a single model file, use 1.

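As an aside, a hypothetical sketch (not code from either repo) of how the part count could be inferred from the files on disk instead of hardcoded, assuming the conventional `model.bin`, `model.bin.1`, `model.bin.2`, ... naming used for multi-part exports:

```cpp
// Hypothetical helper: count split files next to the base model file.
#include <cstdio>
#include <fstream>
#include <string>

static int count_model_parts(const std::string & fname) {
    int n_parts = 1; // the base file "fname" is the first part
    for (int i = 1; ; ++i) {
        // split parts are conventionally named fname.1, fname.2, ...
        std::ifstream f(fname + "." + std::to_string(i), std::ios::binary);
        if (!f.good()) {
            break;
        }
        n_parts = i + 1;
    }
    return n_parts;
}

int main() {
    printf("parts: %d\n", count_model_parts("ggml-model-q4_0.bin"));
    return 0;
}
```
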
Getting the same issue :(

Are you saying this method is valid for llama.cpp but not alpaca.cpp?

Yeah. @antimatter15, are there any things left in your fork that did not get upstreamed yet?

Has anyone gotten this 30B model working with the method above yet? If so, how does it compare to the current 7B and 13B weights? I haven't had a chance to check the implications of this hotfix in the source code. Is this a change we could push to main to add support for these larger models? (I will be testing this method when I get home later.)

I do believe the author is referring to the chat.cpp file, not main.cpp.

I've had a chance to implement this method using the 30B weight and test it. It works! Upon initial testing, this model seems very impressive. While I don't have a baseline to test against, I suspect it is performing better than the currently supported 7B and 13B models. My specs are:

I don't see any reason why the hotfix above to run the 30B weight, along with documentation in the README, shouldn't be pushed to main. @antimatter15, if I forked, implemented this feature (support for the 30B weight) including README documentation, and submitted a PR, would you accept it?

The code snippet to add support for the 30B weight has already been merged in #104. I've just submitted PR #108 to add instructions to the README on how to get the 30B weight running. @antimatter15, it would be very helpful if you would accept PR #108. A lot of people would love this :)

For this issue, try recompiling the chat binary with `make chat`. For me it's working.