[Bug] torch.OutOfMemoryError: CUDA out of memory when quantizing in a notebook #2915
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Checklist
Describe the bug
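Version: 0.6.3
Configuration: GPU T4 x 2 (16 x 2)
When quantizing glm4-9b with AWQ, I noticed that GPU 1 was at nearly 100% while GPU 2 stayed at 0%. The run fails with the out-of-memory error near the end of quantization, so it looks like GPU 2 is never used.
Do I need to set some additional parameters?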
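For anyone trying to confirm the same imbalance, per-GPU load can be captured with `nvidia-smi` while the quantization runs. The following is only a minimal sketch, assuming `nvidia-smi` is on the PATH and reports numeric values for every queried field. The helper names (`parse_gpu_stats`, `query_gpu_stats`, `idle_gpus`) and the 5% threshold are hypothetical and not part of any library.

```python
import subprocess

# Fields requested from nvidia-smi; these are standard --query-gpu properties.
QUERY = "index,utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts.

    Assumes every field is numeric (no "[N/A]" entries).
    """
    stats = []
    for line in csv_text.strip().splitlines():
        index, util, used, total = [field.strip() for field in line.split(",")]
        stats.append(
            {
                "index": int(index),
                "util_pct": int(util),
                "used_mib": int(used),
                "total_mib": int(total),
            }
        )
    return stats


def query_gpu_stats():
    """Run nvidia-smi once and return the parsed per-GPU statistics."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout)


def idle_gpus(stats, util_threshold=5):
    """Return indices of GPUs whose utilization is at or below the threshold."""
    return [gpu["index"] for gpu in stats if gpu["util_pct"] <= util_threshold]


if __name__ == "__main__":
    current = query_gpu_stats()
    for gpu in current:
        print(gpu)
    print("idle GPUs:", idle_gpus(current))
```

Running this in a separate notebook cell or terminal during quantization shows whether the second card is actually idle when the error occurs.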
Reproduction
1
Environment
Error traceback