Quantizing a text embedding model #289
Comments
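I'd suggest giving up on this. |
That said, I have tried it myself, and most layers came out fine. If you want to try, you could quantize using npy inputs, or quantize only the Linear layers in the middle and see how that goes. |
Thanks for the answer. I'll try quantizing with npy inputs. My current question is how the npy files should be organized. The model takes three inputs: input_ids, attention_mask, and token_type_ids. Do I need to preprocess each sentence into these three tensors and then save them as npy files? |
Yes. Quantization with multiple npy inputs is supported; see the example in the documentation for details. |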
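For anyone reading later, here is a minimal sketch of how one calibration sample could be turned into three npy files plus a dataset list file. It is not taken from the toolkit documentation. The helper names (`pad_to_length`, `write_calibration_sample`, `write_dataset_file`), the fixed length of 128, the `int64` dtype, the `(1, max_len)` shape, and the "one sample per line, paths separated by spaces" layout are all assumptions to check against the documentation example and against the model's actual input signature. The token IDs are placeholders; in practice they would come from the model's own tokenizer.

```python
import os
import numpy as np


def pad_to_length(ids, max_len, pad_id=0):
    """Truncate or right-pad a list of token ids to exactly max_len entries."""
    ids = list(ids)[:max_len]
    return ids + [pad_id] * (max_len - len(ids))


def write_calibration_sample(out_dir, index, token_ids, max_len=128, pad_id=0):
    """Save one sample as three .npy files; return the paths in input order.

    Assumed order: input_ids, attention_mask, token_type_ids.
    It must match the order of the model's inputs.
    """
    os.makedirs(out_dir, exist_ok=True)
    n_real = min(len(token_ids), max_len)

    # Shape (1, max_len) and int64 are assumptions; match your exported model.
    input_ids = np.array([pad_to_length(token_ids, max_len, pad_id)], dtype=np.int64)
    attention_mask = np.array([[1] * n_real + [0] * (max_len - n_real)], dtype=np.int64)
    token_type_ids = np.zeros((1, max_len), dtype=np.int64)  # single-sentence input

    paths = []
    for name, arr in (
        ("input_ids", input_ids),
        ("attention_mask", attention_mask),
        ("token_type_ids", token_type_ids),
    ):
        path = os.path.join(out_dir, f"{name}_{index}.npy")
        np.save(path, arr)
        paths.append(path)
    return paths


def write_dataset_file(dataset_path, samples):
    """Write one line per sample, with that sample's npy paths separated by spaces."""
    with open(dataset_path, "w") as f:
        for paths in samples:
            f.write(" ".join(paths) + "\n")
```

A typical use would be to loop over a few hundred representative sentences, call `write_calibration_sample` for each, and pass the resulting list file as the quantization dataset when building the model. Paths containing spaces would break this line format, so keep the output directory name simple.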
OK, thanks for the support. I'll experiment and see how the results turn out. |
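I have a text embedding model. Currently I use non-quantized floating-point precision and can export a .rknn model, but I would like to go further and apply int8 or w8a8 quantization. How should I go about this?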