Pull requests: vllm-project/vllm
Pull requests list
[CI/Build] Fixes missing runtime dependencies
ci/build
#29822
opened Dec 1, 2025 by
bbartels
5 tasks
[Perf] Async Scheduling + Speculative Decoding + Structured Outputs
needs-rebase
structured-output
tpu
Related to Google TPUs
v1
#29821
opened Dec 1, 2025 by
benchislett
Draft
[Frontend] refactor harmony utils output message parsing
frontend
gpt-oss
Related to GPT-OSS models
#29820
opened Dec 1, 2025 by
daniel-salib
fix(shm): Add memory barriers for cross-process shared memory visibility
#29819
opened Dec 1, 2025 by
kitaekatt
3 tasks done
[CI] Fix python install ci error
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#29818
opened Dec 1, 2025 by
yewentao256
[Bugfix][Model] Support LoRA on Qwen3 Output Embedding
qwen
Related to Qwen models
#29816
opened Dec 1, 2025 by
klshuster
5 tasks
[BugFix] Fix potential out of bounds blocktable access
v1
#29811
opened Dec 1, 2025 by
LucasWilkinson
[Kernel] Support rms_norm kernel for Gemma
performance
Performance-related issues
#29810
opened Dec 1, 2025 by
xyang16
5 tasks
[llama4_eagle] add lm_head attribute to model class
llama
Related to Llama models
speculative-decoding
#29809
opened Dec 1, 2025 by
divakar-amd
Draft
SplitK with Atomic Reduce Counting for Skinny GEMMs
rocm
Related to AMD ROCm
#29807
opened Dec 1, 2025 by
amd-hhashemi
5 tasks
Dummy update: testing CI pipeline.
amd
ci/build
rocm
Related to AMD ROCm
#29806
opened Dec 1, 2025 by
Alexei-V-Ivanov-AMD
5 tasks
[Misc][Hybrid allocator + kv connector] Optionally enable hybrid allocator + KV cache connector
kv-connector
v1
#29805
opened Dec 1, 2025 by
NickLucche
Fix some Transformers nightly tests
qwen
Related to Qwen models
#29802
opened Dec 1, 2025 by
hmellor
[Core] Eliminate redundant is_encoder_decoder lookups (20-40us/step)
v1
#29800
opened Dec 1, 2025 by
wushidonguc
[Misc] Update conftest for entrypoints/sagemaker test folder
ready
ONLY add when PR is ready to merge/full CI is needed
#29799
opened Dec 1, 2025 by
zhaozuy
3 of 5 tasks
[responsesAPI][5] ResponsesParser with tools for full MCP loop
frontend
gpt-oss
Related to GPT-OSS models
fix error while downloading dependencies for CPU backend
ci/build
#29797
opened Dec 1, 2025 by
MaoJianwei
3 of 5 tasks
[Perf] Improve fp8 quant in mla; replace ReduceSum with ReduceScatterSum
nvidia
v1
#29795
opened Dec 1, 2025 by
IwakuraRein
3 of 5 tasks
Support tokenization_kwargs override
frontend
multi-modality
Related to multi-modality (#4194)
#29794
opened Dec 1, 2025 by
piood
[Chore] Move tokenizer initialization methods
deepseek
Related to DeepSeek models
multi-modality
Related to multi-modality (#4194)
performance
Performance-related issues
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
structured-output
tool-calling
tpu
Related to Google TPUs
v1
#29793
opened Dec 1, 2025 by
DarkLight1337
5 tasks
[BUILD] fix url joining behavior when downloading precompiled wheels
ci/build
#29792
opened Dec 1, 2025 by
Harry-Chen
3 of 5 tasks
BUGFIX: Handle layer name inconsistencies in pipeline parallel training
nvidia
v1
#29789
opened Dec 1, 2025 by
penfever
2 of 5 tasks