Pull requests: vllm-project/vllm
#10711 [Model] Support Mantis(LLaVA) model (labels: ci/build, documentation) - opened Nov 27, 2024 by DarkLight1337 - Draft
#10705 [Bugfix][Mamba] Fix Multistep on Mamba-like models (labels: ready) - opened Nov 27, 2024 by mzusman
#10704 [WIP][CI] Add genai-perf benchmark in nightly benchmark (labels: ci/build) - opened Nov 27, 2024 by jikunshang
#10701 [Doc] Update model in arch_overview.rst to match comment (labels: documentation) - opened Nov 27, 2024 by spacewander
#10682 [Model] Support bitsandbytes quantization with minicpm3 model - opened Nov 27, 2024 by zixuanzhang226
#10676 [3/N] Support and implement merged input processor for LLaVA model - opened Nov 26, 2024 by DarkLight1337
#10675 [Bugfix] Fix GGUF inference with FP16 unquantized checkpoint (labels: ready) - opened Nov 26, 2024 by Isotr0py
#10672 [Doc] Add GitHub links for source code references (labels: documentation) - opened Nov 26, 2024 by russellb
#10647 [Core] Integrate Fastsafetensor loader for loading model weights (labels: ci/build, documentation) - opened Nov 26, 2024 by manish-sethi - Draft
#10640 [V1] VLM - Support running the mm_mapper preprocessor in the frontend process (labels: frontend, needs-rebase) - opened Nov 25, 2024 by alexm-neuralmagic
#10635 [Frontend] Don't block event loop in tokenization (preprocess) in OpenAI-compatible server (labels: frontend) - opened Nov 25, 2024 by tomeras91
#10623 [Misc] Allow LoRA to adaptively increase rank and remove possible_max_ranks - opened Nov 25, 2024 by JinhyunBang
#10608 [Core][Bugfix] Use correct device to initialize GPU data during CUDA-graph capture - opened Nov 24, 2024 by IdoAsraff
#10604 [Fix] Correct num_accepted_tokens counting (labels: ready) - opened Nov 24, 2024 by KexinFeng
#10574 [Kernels][AMD] Add Fused MoE Configs - opened Nov 22, 2024 by robertgshaw2-neuralmagic - Draft
#10565 [Hardware][Intel-Gaudi] Enable LoRA support for Intel Gaudi (HPU) - opened Nov 22, 2024 by SanjuCSudhakaran