vllm-project / vllm Public

Pull requests: vllm-project/vllm

475 Open 5,378 Closed

Pull requests list

[Core] tokens in queue metric

#12286 opened Jan 21, 2025 by annapendleton

[ROCm] Faster Custom Paged Attention kernels

Labels: ci/build, rocm

#12348 opened Jan 23, 2025 by tjtanaa • Draft

[Bugfix] handle alignment of arguments in convert_sparse_cross_attention_mask_to_dense

#12347 opened Jan 23, 2025 by tjohnson31415

FLOP counting for vLLM inference

#12341 opened Jan 23, 2025 by dianastea • Draft

[Build] Only build 9.0a for scaled_mm and sparse kernels

Labels: ci/build, ready

#12339 opened Jan 23, 2025 by LucasWilkinson

[Frontend] Generate valid tool call IDs when using tokenizer-mode=mistral

Labels: frontend

#12332 opened Jan 22, 2025 by rafvasq

[Core] Optimizing cross-attention QKVParallelLinear computation

#12325 opened Jan 22, 2025 by NickLucche

2 tasks

[Hardware][Gaudi][Bugfix] Fix error for guided decoding

Labels: ci/build

#12317 opened Jan 22, 2025 by zhouyu5

[do-not-merge][perf-benchmark] cleanup unused docker images/containers

Labels: ci/build, perf-benchmarks

#12306 opened Jan 22, 2025 by khluu

[Feature][Spec Decode] Simplify the use of Eagle Spec Decode

#12304 opened Jan 22, 2025 by ShangmingCai

[Hardware][Gaudi][Feature] Enable Dynamic MoE for Mixtral

#12303 opened Jan 22, 2025 by zhenwei-intel

[Core] Make disaggregated prefill compatible with pipeline parallelism

#12301 opened Jan 22, 2025 by YuhanLiu11

[Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels

Labels: ci/build

#12294 opened Jan 22, 2025 by fenghuizhang

[Core] Optimize topp/topk calculation in sampler

#12156 opened Jan 17, 2025 by afierka-intel

[Core] Prefill Only Tokens Without KV Cache in Batch Requests (Disagg Prefill)

#12285 opened Jan 21, 2025 by Shaoting-Feng

[CI/Build] Add label automation for structured-output / speculative-decoding

Labels: ci/build

#12280 opened Jan 21, 2025 by russellb

NVIDIA Blackwell codegen

Labels: ci/build, documentation

#12271 opened Jan 21, 2025 by johnnynunez

[Model] Enable Inference Support for the New Baichuan-M1 Model

Labels: documentation, new model

#12251 opened Jan 21, 2025 by rainkert

[Misc] Move find_loaded_library to platform_aware_utils.py

#12231 opened Jan 20, 2025 by houseroad

[VLM] Merged multi-modal processor for Pixtral

#12211 opened Jan 20, 2025 by Flechman • Draft

[V1][Spec Decode] Ngram Spec Decode

#12193 opened Jan 19, 2025 by LiuXiaoxuanPKU

4 of 5 tasks

[Bugfix] fix race condition that leads to wrong order of token returned

#12192 opened Jan 19, 2025 by joennlae

[Misc] Add Gemma2 GGUF support

#12186 opened Jan 18, 2025 by Isotr0py

[Kernel] add triton fused moe kernel for gptq/awq moe quantization

Labels: ready

#12185 opened Jan 18, 2025 by jinzhen-lin

[Quantization/Parameter] WIP: Another Implementation of the Quantization Parameter Subclass Substitution

#12158 opened Jan 17, 2025 by cennn
