Issues: vllm-project/vllm

[Roadmap] vLLM Roadmap Q1 2025
#11862 opened Jan 8, 2025 by simon-mo
Open 3
vLLM's V1 Engine Architecture
#8779 opened Sep 24, 2024 by simon-mo
Open 11

Issues list

Release v0.7.1 (label: release)
#12465 opened Jan 27, 2025 by simon-mo
4 tasks
[Bug]: the most recent xla nightly is breaking vllm on TPU (label: bug)
#12451 opened Jan 26, 2025 by hosseinsarshar
1 task done
[New Model]: IDEA-Research/ChatRex-7B (label: new model)
#12444 opened Jan 26, 2025 by Fr0do
1 task done
[Bug]: nrt_tensor_allocate status=4 message="Allocation Failure" on AWS Neuron (label: bug)
#12443 opened Jan 26, 2025 by StefanDimitrov95
1 task done
[Bug]: Could not run '_C::rms_norm' with arguments from the 'CUDA' backend. (label: bug)
#12441 opened Jan 26, 2025 by 851780266
1 task done
Flash Attention 3 (FA3) Support
#12429 opened Jan 25, 2025 by mgoin
3 tasks
[Installation]: no module named "resources" (label: installation)
#12425 opened Jan 25, 2025 by Omni-NexusAI
1 task done
[Bug]: Performance regression when using PyTorch regional compilation (label: bug)
#12410 opened Jan 24, 2025 by anko-intel
1 task done
[Bug]: Slower inference time with fewer input tokens (label: bug)
#12406 opened Jan 24, 2025 by vishalkumardas
1 task done
[Bug]: InternVL2-26B-AWQ service startup failure (label: bug)
#12404 opened Jan 24, 2025 by CallmeZhangChenchen
1 task done
[Bug]: AsyncEngineDeadError during inference with two vLLM engines on a single GPU (label: bug)
#12401 opened Jan 24, 2025 by semensorokin
1 task done
[Performance]: Unexpected performance of vLLM Cascade Attention (label: performance)
#12395 opened Jan 24, 2025 by lauthu
1 task done
[Usage]: using vLLM to serve a GGUF model with CPU only (label: usage)
#12391 opened Jan 24, 2025 by pamdla
1 task done
[Performance]: Details about the performance of vLLM on reasoning models (label: performance)
#12387 opened Jan 24, 2025 by shaoyuyoung
1 task done