Pull requests: vllm-project/vllm
Pull requests list
[Bugfix/CI] Fix broken kernels/test_mha.py
ready
ONLY add when PR is ready to merge/full CI is needed
#12450
opened Jan 26, 2025 by
tlrmchlsmth
updated Jan 26, 2025
[Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels
ci/build
#12294
opened Jan 22, 2025 by
fenghuizhang
updated Jan 26, 2025
[Build/CI] Fix libcuda.so linkage
ci/build
ready
#12424
opened Jan 25, 2025 by
tlrmchlsmth
updated Jan 26, 2025
Implements dual-chunk-flash-attn backend for dual chunk attention with sparse attention support
ci/build
#11844
opened Jan 8, 2025 by
sighingnow
updated Jan 26, 2025
[Misc] Separate hf dataset sampling function from benchmark_serving.py
#12447
opened Jan 26, 2025 by
Isotr0py
updated Jan 26, 2025
[Frontend] Support scores endpoint in run_batch
frontend
ready
#12430
opened Jan 25, 2025 by
pooyadavoodi
updated Jan 26, 2025
[Bugfix][Kernel] Fix perf regression caused by PR #12405
ci/build
ready
#12434
opened Jan 26, 2025 by
LucasWilkinson
updated Jan 26, 2025
[Bugfix] Fix Granite 3.0 MoE model loading
ready
#12446
opened Jan 26, 2025 by
DarkLight1337
updated Jan 26, 2025
[Model]: Add transformers backend support
ci/build
documentation
Improvements or additions to documentation
#11330
opened Dec 19, 2024 by
ArthurZucker
updated Jan 26, 2025
[Frontend][Core][Enhacement] Add S3 and remote source support for LoRA adapters
ci/build
documentation
frontend
needs-rebase
#12029
opened Jan 14, 2025 by
Prashant18
updated Jan 26, 2025
4 tasks done
[FlashInfer] Upgrade to 0.2.0
ci/build
documentation
frontend
ready
#11194
opened Dec 14, 2024 by
abmfy
updated Jan 26, 2025
[V1][Metrics] Add initial Prometheus logger
ready
#12416
opened Jan 24, 2025 by
markmc
updated Jan 26, 2025
[Distributed][refactor] Add base class for device-specific communicator
#11324
opened Dec 19, 2024 by
MengqingCao
updated Jan 26, 2025
[Hardware][Ascend] Add Ascend NPU backend
ci/build
needs-rebase
#8054
opened Aug 31, 2024 by
wangshuai09
updated Jan 26, 2025
12 tasks done
[Platform] add pre_register_and_update function
#12432
opened Jan 26, 2025 by
wangxiyuan
updated Jan 26, 2025
[Hardware][Gaudi][Feature] Support Contiguous PA
#12139
opened Jan 17, 2025 by
zhouyu5
updated Jan 26, 2025
[Hardware][Gaudi][Bugfix] Fix error for guided decoding
ci/build
#12317
opened Jan 22, 2025 by
zhouyu5
updated Jan 26, 2025
[Misc] Add offline test for disaggregated prefill
#12418
opened Jan 24, 2025 by
Shaoting-Feng
updated Jan 26, 2025
[Kernel] add triton fused moe kernel for gptq/awq
moe
quantization
ready
#12185
opened Jan 18, 2025 by
jinzhen-lin
updated Jan 26, 2025
LoRA Support for Ultravox model
documentation
#11253
opened Dec 17, 2024 by
thedebugger
updated Jan 26, 2025
[Hardware][Intel GPU] add XPU bf16 support
documentation
ready
#12392
opened Jan 24, 2025 by
jikunshang
updated Jan 26, 2025
add support for AMD MI25/50/60
#12431
opened Jan 26, 2025 by
Said-Akbar
updated Jan 26, 2025
[Misc] Add BNB quantization for Whisper
#12381
opened Jan 24, 2025 by
jeejeelee
updated Jan 26, 2025
[V1][Spec Decode] Ngram Spec Decode
#12193
opened Jan 19, 2025 by
LiuXiaoxuanPKU
updated Jan 25, 2025
4 of 5 tasks