Issues: vllm-project/vllm

[RFC]: Multi-modality Support on vLLM
feature request, RFC · #4194 opened Apr 19, 2024 by ywang96 · 45 of 78 tasks

[Misc]: Throughput/Latency for guided_json with ~100% GPU cache utilization
misc, structured-output · #3567 opened Mar 22, 2024 by jens-create

[Installation]: pip install vllm (0.6.3) forces a reinstallation of the CPU version of torch, replacing the CUDA torch on Windows
installation (Installation problems) · #9701 opened Oct 25, 2024 by xiezhipeng-git

Recent vLLMs ask for too much memory: ValueError: No available memory for the cache blocks. Try increasing gpu_memory_utilization when initializing the engine.
bug (Something isn't working), unstale · #2248 opened Dec 24, 2023 by pseudotensor

[Performance]: decoding speed on long context
performance (Performance-related issues) · #11286 opened Dec 18, 2024 by 155394551lzk · 1 task done

[Bug]: vllm.engine.async_llm_engine.AsyncEngineDeadError: Background loop has errored already.
bug (Something isn't working) · #5060 opened May 26, 2024 by heungson

Is there a way to terminate vllm.LLM and release the GPU memory?
#1908 opened Dec 4, 2023 by sfc-gh-zhwang

API causes slowdown in batch request handling
bug (Something isn't working), unstale · #1707 opened Nov 17, 2023 by jpeig

vLLM generates empty output
bug (Something isn't working) · #1185 opened Sep 26, 2023 by FocusLiwen

[Bug]: v0.6.4.post1 crashed: Error in model execution: CUDA error: an illegal memory access was encountered
bug (Something isn't working) · #10389 opened Nov 16, 2024 by wciq1208 · 1 task done

[Usage]: Does serving the model the manual way differ from the predefined (OpenAI) way? A quick question, please guide
usage (How to use vllm) · #11569 opened Dec 27, 2024 by AayushSameerShah

[Bug]: vLLM 0.5.3.post1 [rank0]: RuntimeError: NCCL error: unhandled cuda error (run with NCCL_DEBUG=INFO for details)
bug (Something isn't working) · #6732 opened Jul 24, 2024 by jueming0312

[Model] DeepSeek-V3 Enhancements
new model (Requests for new models), performance (Performance-related issues) · #11539 opened Dec 27, 2024 by simon-mo · 2 of 10 tasks

[Usage]: how to use EAGLE on vLLM?
usage (How to use vllm) · #11126 opened Dec 12, 2024 by xiongqisong · 1 task done

[Bug]: Does vLLM support function call mode?
bug (Something isn't working) · #6631 opened Jul 22, 2024 by FanZhang91

[Bug]: Qwen1.5-14B-Chat deployed with vllm==0.3.3 on a Tesla V100-PCIE-32GB outputs only exclamation marks, no results
bug (Something isn't working) · #3998 opened Apr 11, 2024 by li995495592

Could not build wheels for vllm, which is required to install pyproject.toml-based projects
installation (Installation problems), stale · #1391 opened Oct 17, 2023 by ABooth01

[Bug]: No available block found in 60 seconds in shm
bug (Something isn't working) · #6614 opened Jul 21, 2024 by wjj19950828

[New Model]: Qwen/QwQ-32B-Preview
new model (Requests for new models) · #10737 opened Nov 28, 2024 by SionicAI-Engineering · 1 task done

[Performance]: Phi-3.5 vision model consumes high CPU RAM and the process gets killed
performance (Performance-related issues), stale · #9190 opened Oct 9, 2024 by kuladeephx · 1 task done