Pull requests: vllm-project/vllm
Pull requests list
[model] Add RelPositionMultiHeadedAttention
documentation
unstale
#4956
opened May 21, 2024 by
rajveer43
[Hardware][Intel] fp8 kv cache support for CPU
x86 CPU
#5492
opened Jun 13, 2024 by
jikunshang
•
Draft
[Frontend] support only use linear lora modules in attention
needs-rebase
unstale
#5483
opened Jun 13, 2024 by
jinzhen-lin
[Usage] Clarify and Update Argument for Specifying Model Revisions
frontend
needs-rebase
unstale
#5453
opened Jun 12, 2024 by
Etelis
[Bugfix] Take the VRAM usage of prompt_logprobs into account
needs-rebase
unstale
#5355
opened Jun 8, 2024 by
Conless
[BugFix] Fix the problem that StopChecker assumes a single token produ…
needs-rebase
unstale
#5243
opened Jun 4, 2024 by
IcyFeather233
[Misc] Adding Speculative decoding to Throughput Benchmarking script
needs-rebase
unstale
#5223
opened Jun 3, 2024 by
abhibambhaniya
[Bugfix] [Frontend] vLLM api_server.py when using with prompt_token_ids causes error.
frontend
needs-rebase
unstale
#5187
opened Jun 1, 2024 by
TikZSZ
[Core] Bump up the default of --gpu_memory_utilization to be more similar to TensorRT Triton's default
documentation
frontend
needs-rebase
unstale
#5158
opened May 31, 2024 by
alexm-neuralmagic
[KERNEL] int8 quantization kernel refactoring & optimization WIP
needs-rebase
unstale
#5146
opened May 31, 2024 by
ZelboK
[Bugfix] Adds outlines performance improvement
#5053
opened May 26, 2024 by
lynkz-matt-psaltis
•
Draft
[WIP] Make chunked prefill work with lora
needs-rebase
#4994
opened May 23, 2024 by
rkooo567
[Misc] Support HF Hub remote loading for LoRA adapters
documentation
needs-rebase
unstale
#4939
opened May 21, 2024 by
Isotr0py
3 tasks done
[CI/Build] Make marlin kernel build conditional.
ci/build
needs-rebase
#4905
opened May 19, 2024 by
esmeetu
[Build/CI] Extending AMD Tests
ci/build
needs-rebase
unstale
#4875
opened May 17, 2024 by
Alexei-V-Ivanov-AMD
Add a new kernel for fusing the dequantization in fused-moe gemm
needs-rebase
unstale
#4841
opened May 15, 2024 by
RezaYazdaniAminabadi
[Misc] Logits processor plugins
documentation
frontend
needs-rebase
unstale
#4769
opened May 11, 2024 by
NadavShmayo
[Misc] Added devcontainer to help vscode dev setup
unstale
#4720
opened May 9, 2024 by
ElefHead