Text embedding model quantization #289

Open
Data-Adventure opened this issue Mar 4, 2025 · 5 comments

Comments

@Data-Adventure

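I have a text embedding model. At the moment I run it unquantized, in floating-point precision, and I can export it to a .rknn model, but I would like to go further and apply int8 or w8a8 quantization. How should I go about this?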

@yuyun2000

I'd suggest giving up on this.

@yuyun2000

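That said, I have tried it myself, and most layers come out fine. If you want to try, you can quantize with npy inputs, or quantize only the intermediate linear layers and see how that goes.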

@Data-Adventure
Author

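That said, I have tried it myself, and most layers come out fine. If you want to try, you can quantize with npy inputs, or quantize only the intermediate linear layers and see how that goes.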

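Thanks for the answer, I will try quantizing with npy inputs. The problem I am facing right now is how to organize the npy files. The model takes three inputs: input_ids, attention_mask, and token_type_ids. Do I need to preprocess each sentence into these three tensors and then save them as npy files?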

@yuyun2000

Yes, quantization with multiple npy inputs is supported. See the examples in the documentation for details.
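
For illustration, here is a minimal sketch of one way the calibration data could be laid out. It is not taken from the thread or copied from the documentation. It assumes the usual RKNN-Toolkit2 convention that each line of the dataset file is one sample and lists one .npy path per model input, separated by spaces, in the model's input order. `toy_encode`, `MAX_LEN`, the int64 dtype, and the file names are hypothetical placeholders; in practice the model's real tokenizer and the shapes and dtypes of the exported model would be used.

```python
import os
import numpy as np

MAX_LEN = 16  # hypothetical fixed sequence length; must match the exported model

# Assumed to match the input order of the exported model.
INPUT_ORDER = ["input_ids", "attention_mask", "token_type_ids"]


def toy_encode(sentence, max_len=MAX_LEN):
    """Stand-in for the model's real tokenizer (hypothetical, illustration only)."""
    ids = [(sum(map(ord, w)) % 1000) + 1 for w in sentence.split()][:max_len]
    pad = max_len - len(ids)
    return {
        "input_ids": ids + [0] * pad,
        "attention_mask": [1] * len(ids) + [0] * pad,
        "token_type_ids": [0] * max_len,
    }


def write_calibration_set(sentences, out_dir, encode=toy_encode):
    """Save one .npy per input per sentence and write a dataset list file.

    Returns the path of the list file. Paths must not contain spaces,
    because spaces separate the inputs on each line.
    """
    os.makedirs(out_dir, exist_ok=True)
    lines = []
    for i, sentence in enumerate(sentences):
        enc = encode(sentence)
        paths = []
        for name in INPUT_ORDER:
            # Shape (1, MAX_LEN): a batch of one, assumed to match the model input.
            arr = np.asarray(enc[name], dtype=np.int64).reshape(1, -1)
            path = os.path.join(out_dir, f"{name}_{i}.npy")
            np.save(path, arr)
            paths.append(path)
        lines.append(" ".join(paths))
    dataset_path = os.path.join(out_dir, "dataset.txt")
    with open(dataset_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return dataset_path
```

The resulting `dataset.txt` would then be the file passed as the `dataset` argument when building with quantization enabled. A representative set of real sentences is likely to give better calibration than a handful of toy inputs.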

@Data-Adventure
Author

Yes, quantization with multiple npy inputs is supported. See the examples in the documentation for details.

OK, thanks for the support. I will run some experiments and see what the results look like.
