Issues: InternLM/lmdeploy
#815 [Benchmark] benchmarks on different cuda architecture with mo... (opened Dec 11, 2023 by lvhan028; open, 9 comments)
#2948 [Bug] generation profile hangs on Mixtral-8x7B-Instruct-v0.1 with pytorch backend (opened Dec 24, 2024 by zhulinJulia24)
#2789 [Bug] Does PytorchEngine Visual Model Support Prefix Caching? (opened Nov 21, 2024 by OftenDream)
#2679 [Bug] pytorch backend's precision drops by 1.0-2.5 points between the main branch and v0.6.1 on some models (opened Oct 29, 2024 by zhulinJulia24)
#2567 [Docs] Is there a benchmark showing whether the w8a8-triton implementation in lmdeploy actually speeds up inference for real LLMs (e.g. llama2, qwen2)? (opened Oct 9, 2024 by brisker)
#2457 [Bug] output is not consistent across different max_prefill_token_num values for long-context input on the pytorch engine (opened Sep 12, 2024 by RunningLeon)
#2360 [Bug] requests to the OpenAI-compatible server raise asyncio.exceptions.TimeoutError (opened Aug 22, 2024 by wlwqq)
#1449 [Bug] PyTorch engine performance is poor compared to vllm (opened Apr 18, 2024 by jjjjohnson)
#1151 [Docs] got an unexpected keyword argument 'enable_lora' (opened Feb 19, 2024 by sleepwalker2017)
#522 [Deploy Error] ValueError: If eos_token_id is defined, make sure that pad_token_id is defined (opened Oct 1, 2023 by vansin; a workaround sketch follows this list)
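
The ValueError in #522 comes from the Hugging Face transformers stack when a checkpoint defines eos_token_id but no pad_token_id. Below is a minimal sketch of the usual generic workaround, assuming the model is loaded through transformers; the model id is only a placeholder and this is not necessarily the fix adopted in the issue thread.

    # Generic workaround (assumption, not taken from the issue thread):
    # give the model a pad token when only eos_token_id is defined.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "internlm/internlm-chat-7b"  # placeholder; use the affected checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    # Many chat checkpoints ship without a pad token; reuse eos as pad so that
    # checks of the form "eos_token_id defined but pad_token_id undefined" pass.
    if tokenizer.pad_token_id is None and tokenizer.eos_token_id is not None:
        tokenizer.pad_token = tokenizer.eos_token
        model.config.pad_token_id = tokenizer.eos_token_id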