
After deployment, inference latency increases with each consecutive request #2924

Open
nzomi opened this issue Dec 18, 2024 · 0 comments

Comments


nzomi commented Dec 18, 2024

Hello developers,

I deployed InternVL2 with LMdeploy and exposed it through the OpenAI-compatible API. When I repeatedly test the same batch of images, i.e. send many requests in succession, I find that the later a request is in the sequence, the longer its inference takes. If I send too many requests at once, the ones near the end of the queue tend to time out or hang. Is there any way to optimize this?
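One common client-side mitigation for this symptom is to cap the number of in-flight requests so that later requests queue on the client instead of piling up on the server and timing out. Below is a minimal sketch of that pattern using an `asyncio.Semaphore`; the body of `send_request` is a placeholder (a simulated delay), not LMdeploy's actual API — in practice it would be replaced with a call to the OpenAI-compatible endpoint (e.g. `chat.completions.create`):

```python
import asyncio

MAX_CONCURRENCY = 4  # tune to the server's actual capacity


async def send_request(sem: asyncio.Semaphore, idx: int) -> int:
    """Placeholder for one inference request.

    A real implementation would issue an OpenAI-compatible
    chat-completion request against the LMdeploy endpoint here.
    """
    async with sem:  # at most MAX_CONCURRENCY requests in flight
        await asyncio.sleep(0.01)  # simulated inference time
        return idx


async def run_batch(n_requests: int = 16) -> list[int]:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    # gather preserves submission order in its result list
    return await asyncio.gather(
        *(send_request(sem, i) for i in range(n_requests))
    )


if __name__ == "__main__":
    results = asyncio.run(run_batch())
    print(len(results))  # → 16
```

With the semaphore in place, the server only ever sees a bounded number of concurrent requests, so the tail requests wait briefly on the client rather than accumulating in the server's queue until they hit a timeout. The concurrency limit here is an assumption to be tuned against the deployment's throughput.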
