Pull requests: vllm-project/vllm

Pull requests list

Add required libcuda.so (labels: ci/build, needs-rebase, ready)
#6864 opened Jul 27, 2024 by sdake
[Model] Teleflm Support
#6822 opened Jul 26, 2024 by horizon94
Prefetch all (labels: needs-rebase)
#6817 opened Jul 26, 2024 by gnpinkert
[DOC] Correct warning about performance
#6654 opened Jul 22, 2024 by casper-hansen
[ CI ] Awq Marlin Integration Tests (labels: ci/build, needs-rebase, ready)
#6627 opened Jul 22, 2024 by robertgshaw2-neuralmagic
[WIP] Fp8 marlin grouped
#6608 opened Jul 20, 2024 by mgoin Draft
[Kernel] Unify the kernel used in flash attention backend (labels: needs-rebase, ready)
#6052 opened Jul 2, 2024 by LiuXiaoxuanPKU
[Not for review] Pp adag proto
#6526 opened Jul 17, 2024 by ruisearch42 Draft
[Model] Add Support for GPTQ Fused MOE
#6502 opened Jul 17, 2024 by izhuhaoran
[Not for review] Spmd tp rebase (labels: ready)
#6483 opened Jul 16, 2024 by ruisearch42 Draft
[Not for review] PP ADAG
#6448 opened Jul 15, 2024 by ruisearch42 Draft
[ Misc ] Support Act Order in Compressed Tensors (labels: needs-rebase, ready)
#6358 opened Jul 12, 2024 by robertgshaw2-neuralmagic
[WIP] Emulated fp8 inference
#6111 opened Jul 3, 2024 by mgoin Draft