[Bug]: Unsloth bitsandbytes quantized model cannot be run due to: KeyError: 'layers.42.mlp.down_proj.weight.absmax'
#10710
Labels
bug
Your current environment
The output of `python collect_env.py`
Model Input Dumps
No response
🐛 Describe the bug
Hello,

I would like to run `unsloth/Meta-Llama-3.1-70B-bnb-4bit` using the `AsyncLLMEngine`. Please note that `lora_request` is `None` and `enable_lora` is `False` in this case.
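The setup is roughly the following (a minimal sketch rather than my exact code; the engine arguments are the relevant part, and the prompt and sampling parameters are placeholders):

```python
import asyncio

from vllm import SamplingParams
from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine

# Load the pre-quantized bnb checkpoint; both flags are set so that vLLM
# uses its bitsandbytes loading path instead of the default weight loader.
engine_args = AsyncEngineArgs(
    model="unsloth/Meta-Llama-3.1-70B-bnb-4bit",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)
engine = AsyncLLMEngine.from_engine_args(engine_args)


async def main() -> None:
    sampling_params = SamplingParams(max_tokens=64)
    final = None
    # No lora_request is passed and enable_lora is left at its default (False).
    async for output in engine.generate(
        "Hello, my name is", sampling_params, request_id="req-0"
    ):
        final = output
    if final is not None:
        print(final.outputs[0].text)


asyncio.run(main())
```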
When I run this, I get `KeyError: 'layers.42.mlp.down_proj.weight.absmax'`, although I was under the impression that `bnb`-quantized models are supported with the newer `vllm` versions.
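One way to check whether the checkpoint actually ships the quant-state tensors the loader is looking up (such as the `.absmax` entries) is to list the serialized keys for the failing layer directly (a small sketch; the shard filename below is hypothetical and depends on how the repo is sharded):

```python
from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Download one shard of the checkpoint and list the tensors stored
# next to the down_proj weight of layer 42 (the layer from the KeyError).
path = hf_hub_download(
    "unsloth/Meta-Llama-3.1-70B-bnb-4bit",
    "model-00001-of-00008.safetensors",  # hypothetical shard name
)
with safe_open(path, framework="pt") as f:
    for key in f.keys():
        if "layers.42.mlp.down_proj" in key:
            # A bnb-4bit checkpoint typically serializes entries such as
            # ...weight, ...weight.absmax, ...weight.quant_map and
            # ...weight.quant_state.bitsandbytes__nf4 alongside each other.
            print(key)
```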
Could it be something like what I do in the following function, which in some cases requires manually patching the weights for the `PEFT` models?
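Roughly, the patching I have in mind looks like this (a hypothetical sketch, not my actual function; the prefix handling is an assumption about how the PEFT key names end up mismatched with what the loader expects):

```python
from typing import Dict

import torch


def patch_quant_state_keys(
    state_dict: Dict[str, torch.Tensor],
) -> Dict[str, torch.Tensor]:
    """Hypothetical sketch: strip a PEFT 'base_model.model.' prefix so that
    bnb quant-state entries such as '<name>.weight.absmax' line up with the
    module names a plain (non-PEFT) loader expects."""
    prefix = "base_model.model."
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }
```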
Thanks in advance!
Before submitting a new issue...