Pull requests: vllm-project/vllm
Pull requests list
Add docker-compose.yml and corresponding .env
unstale
#2895 opened Feb 16, 2024 by WolframRavenwolf
Dynamic Multi LoRA Load \ Delete Support
frontend
needs-rebase
unstale
#3496 opened Mar 19, 2024 by gauravkr2108
silu_and_mul kernel only supports contiguous tensors
action-required
needs-rebase
unstale
#3324 opened Mar 11, 2024 by mwbyeon
Speedup model loading with safetensor format when gpu is available
needs-rebase
unstale
#3185 opened Mar 4, 2024 by chaomengyuan
feat: quadratic + cubic sampling
frontend
needs-rebase
unstale
#3167 opened Mar 3, 2024 by AlpinDale
fix: ignore non-LoRA tensors in adapter model
needs-rebase
unstale
#3151 opened Mar 1, 2024 by AlpinDale
chore(outputs): make return class into dataclass
unstale
#3017 opened Feb 24, 2024 by aarnphm
[Minor] Add benchmark for activation layer
needs-rebase
unstale
#3009 opened Feb 23, 2024 by esmeetu
support stop_token_ids_group like stop_str in check_stop
needs-rebase
unstale
#2926 opened Feb 20, 2024 by LokiLiu
Refactor openai completion api w.r.t Prefix Cache
frontend
needs-rebase
unstale
#2516 opened Jan 20, 2024 by Avinash-Raj
Explicit packed params in preparation for more LoRA support
needs-rebase
unstale
#2843 opened Feb 13, 2024 by pcmoritz
[CI/Build] A perplexity-computing test for the FP8 KV cache system. Originally used in the context of PR #3290
ci/build
needs-rebase
unstale
#3730 opened Mar 29, 2024 by Alexei-V-Ivanov-AMD
Update config.py for models which use_dynamic_ntk
needs-rebase
unstale
#1388 opened Oct 17, 2023 by ZeyuTeng96
A more memory-efficient(1/9) and faster (10×) cuda kernel for performing top-k and top-p operations.
ci/build
needs-rebase
unstale
#1416 opened Oct 19, 2023 by SuperCB
[WIP] Qwen-style dynamic-NTK ROPE kernel for long sequence support
needs-rebase
unstale
#1860 opened Nov 30, 2023 by ZiyueHuang
[FIX] Fix shape mismatch for swapped sequences when logprobs > 0
needs-rebase
unstale
#1971 opened Dec 7, 2023 by derange-alembic
Allow single LLM step to generate multiple tokens
needs-rebase
unstale
#2120 opened Dec 15, 2023 by LiuXiaoxuanPKU
Adds support for gunicorn multiprocess process
ci/build
frontend
needs-rebase
unstale
#2818 opened Feb 8, 2024 by jalotra
Prefix caching and deallocation mechanism
documentation
frontend
needs-rebase
unstale
#2511 opened Jan 19, 2024 by jadielam