Pull requests: vllm-project/vllm
Pull requests list
A more memory-efficient (1/9) and faster (10×) CUDA kernel for performing top-k and top-p operations.
ci/build
needs-rebase
unstale
#1416
opened Oct 19, 2023 by
SuperCB
Support generation from input embedding
frontend
needs-rebase
unstale
#1265
opened Oct 5, 2023 by
pfldy2850
4 tasks done
[Model]: Add transformers backend support
ci/build
documentation
#11330
opened Dec 19, 2024 by
ArthurZucker
[Kernel][Quantization] Custom Floating-Point Runtime Quantization
ci/build
needs-rebase
#8751
opened Sep 23, 2024 by
AlpinDale
4 tasks
AutoQuant: Automatic quantization model for INT8/INT4 inference, INT4 runs faster than AWQ and FP16.
ci/build
documentation
frontend
needs-rebase
unstale
#2801
opened Feb 7, 2024 by
ChengcanWang-zte
GPTQ & AWQ Fused MOE
needs-rebase
unstale
#2761
opened Feb 5, 2024 by
chu-tianxiang
3 tasks done
[RFC/WIP] First steps towards FP8 for Mixtral
ci/build
needs-rebase
unstale
#3208
opened Mar 5, 2024 by
pcmoritz
[Core][Frontend] Add faster-outlines as guided decoding backend
ci/build
needs-rebase
structured-output
#10277
opened Nov 13, 2024 by
unaidedelf8777
[Kernel][Core][WIP] Tree attention and parallel decoding
needs-rebase
unstale
#4325
opened Apr 24, 2024 by
yukavio
[Model] Add moondream vision language model
documentation
needs-rebase
unstale
#4228
opened Apr 20, 2024 by
vikhyat
[torch.compile] A simple solution to recursively compile loaded model: using phi3-small as an example
#8398
opened Sep 12, 2024 by
wschin
[Kernel] Add prefix-caching support for phi-3-small-8k/128k model triton kernel
needs-rebase
#8345
opened Sep 10, 2024 by
congcongchen123
[Kernels] Add an inductor pass to rewrite and fuse collective communication ops with gemms
frontend
needs-rebase
#9886
opened Oct 31, 2024 by
bnellnm
[WIP][Model][Kernel][Bugfix] Commits for new MSFT PhiMoE model
ci/build
needs-rebase
unstale
#7691
opened Aug 20, 2024 by
wenxcs