Issues: vllm-project/vllm

[Roadmap] vLLM Roadmap Q4 2024
#9006 opened Oct 1, 2024 by simon-mo
Open 23
vLLM's V1 Engine Architecture
#8779 opened Sep 24, 2024 by simon-mo
Open 9

Issues list

[Usage]: vllm infer with 2 * Nvidia-L20, output repeat !!!!
Label: usage
#10713 opened Nov 27, 2024 by RoyaltyLJW
1 task done
[Performance]: Unified flashattn kernel not outperforming current one
Label: performance
#10707 opened Nov 27, 2024 by NickLucche
1 task done
[Bug]: VLLM run very very slow in ARM cpu
Label: bug
#10706 opened Nov 27, 2024 by feikiss
1 task done
[Bug]: load llama 70B more than 10min, is that right?
Label: bug
#10702 opened Nov 27, 2024 by ltm920716
1 task done
[Usage]: 4 Bit Finetuned Mistral Model
Label: usage
#10697 opened Nov 27, 2024 by anandmahato
1 task done
[Bug]: MambaCacheManager Can Possibly Run Out of Free Slots
Label: bug
#10693 opened Nov 27, 2024 by fabianlim
1 task done
[Bug]: CPU Docker build fail.
Label: bug
#10689 opened Nov 27, 2024 by Zhenzhong1
1 task done
[Bug]: v0.6.4.post1 Qwen2-VL-7B-Instruct-AWQ crash:shape mismatch
Label: bug
#10686 opened Nov 27, 2024 by wciq1208
1 task done
[RFC]: Make any model an embedding model
Label: RFC
#10674 opened Nov 26, 2024 by DarkLight1337
1 task done
[Usage]: Llama-2-7b-chat-hf as embedding model
Label: usage
#10673 opened Nov 26, 2024 by ra-MANUJ-an
1 task done
[Usage]: how to get every output token score?
Label: usage
#10670 opened Nov 26, 2024 by TonyUSTC
[Usage]: Cannot use xformers with old GPU
Label: usage
#10662 opened Nov 26, 2024 by baimushan
1 task done