
GPTQ quantization of a LoRA fine-tuned Qwen2.5-32B-Instruct fails with: torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 25942 is not positive-definite). #6057

Open
1 task done
camposs1979 opened this issue Nov 18, 2024 · 0 comments
Labels
pending This problem is yet to be addressed

Comments

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.9.1.dev0
  • Platform: Linux-5.15.0-47-generic-x86_64-with-glibc2.35
  • Python version: 3.10.15
  • PyTorch version: 2.4.1+cu121
  • Transformers version: 4.43.4
  • Datasets version: 2.21.0
  • Accelerate version: 0.34.2
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • DeepSpeed version: 0.14.5
  • Bitsandbytes version: 0.44.1
  • vLLM version: 0.4.3

Reproduction

### model
model_name_or_path: models/qwen25_32B_lora_sft
template: qwen

### export
export_dir: models/Qwen2.5-32B-Instruct-GPTQ-Int4
export_quantization_bit: 4
export_quantization_dataset: data/c4_demo.json
export_size: 2
export_device: cpu
export_legacy_format: false

Expected behavior

After LoRA fine-tuning Qwen2.5-32B-Instruct and merging the adapter, 4-bit quantization fails partway through:
Quantizing model.layers blocks : 20%|████████▎ | 13/64 [26:54<1:45:34, 124.20s/it]
Traceback (most recent call last):
File "/root/miniconda3/envs/xtuner-env/bin/llamafactory-cli", line 8, in
sys.exit(main())
File "/root/autodl-tmp/LLaMA-Factory-main/src/llamafactory/cli.py", line 87, in main
export_model()
File "/root/autodl-tmp/LLaMA-Factory-main/src/llamafactory/train/tuner.py", line 76, in export_model
model = load_model(tokenizer, model_args, finetuning_args) # must after fixing tokenizer to resize vocab
File "/root/autodl-tmp/LLaMA-Factory-main/src/llamafactory/model/loader.py", line 156, in load_model
model = load_class.from_pretrained(**init_kwargs)
File "/root/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
return model_class.from_pretrained(
File "/root/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4003, in from_pretrained
hf_quantizer.postprocess_model(model)
File "/root/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/quantizers/base.py", line 195, in postprocess_model
return self._process_model_after_weight_loading(model, **kwargs)
File "/root/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/quantizers/quantizer_gptq.py", line 85, in _process_model_after_weight_loading
self.optimum_quantizer.quantize_model(model, self.quantization_config.tokenizer)
File "/root/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/root/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/optimum/gptq/quantizer.py", line 510, in quantize_model
scale, zero, g_idx = gptq[name].fasterquant(
File "/root/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/auto_gptq/quantization/gptq.py", line 118, in fasterquant
H = torch.linalg.cholesky(H, upper=True)
torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 25942 is not positive-definite).
(xtuner-env) root@autodl-container-824b40a131-f3c15f0d:~/autodl-tmp/LLaMA-Factory-main#

Searching online for this error message suggests the number of calibration samples is insufficient. However, with the same LLaMA-Factory version (0.9.1.dev0) and the same fine-tuning and merge scripts, I previously ran this on the Hengyuan Cloud platform and the quantization completed correctly. This time I only migrated to a different platform, and I don't understand why this error now appears.
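For context, a minimal NumPy sketch of why this step can fail (an illustration of the mechanism, not LLaMA-Factory or AutoGPTQ code): GPTQ accumulates a layer Hessian roughly of the form H = 2·X·Xᵀ from calibration activations, and if the calibration data spans too few directions relative to the layer's hidden dimension, H is rank-deficient, hence not positive-definite, and the Cholesky factorization fails exactly as in the traceback above:

```python
import numpy as np

# Hypothetical illustration: X holds calibration activations
# (hidden_dim x n_tokens). With fewer calibration tokens than
# hidden dimensions, H = 2 * X @ X.T has rank <= n_tokens and
# is singular, so Cholesky cannot complete.
hidden_dim, n_tokens = 64, 16          # deliberately too few tokens
rng = np.random.default_rng(0)
X = rng.standard_normal((hidden_dim, n_tokens))
H = 2.0 * X @ X.T                      # rank <= 16 < 64

try:
    np.linalg.cholesky(H)
    factorized = True
except np.linalg.LinAlgError:
    factorized = False
print(factorized)  # False: same failure mode as the traceback
```

This is consistent with the "insufficient calibration samples" explanation, though numerical issues (BLAS/LAPACK builds differing between platforms) can also tip a borderline Hessian from barely positive-definite to singular.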
PS: after learning about the insufficient-calibration-samples issue, I tried the following modification in quantization.py:
...
#debug for quantization
maxlen = model_args.export_quantization_maxlen
model_args.export_quantization_nsamples = 512
logger.info("==========================================================================")
logger.info(f"maxlen: {maxlen}")
logger.info(f"nsamples: {model_args.export_quantization_nsamples}")

....
I hoped this would raise the number of calibration samples to 512 (I also swapped the calibration dataset for my own, containing 512 Chinese samples), but the same error still occurs.
My questions are:

  1. Why does the same project version, dataset, and training script throw this error on a different platform? Only the rented cloud platform changed; I kept all pip packages identical.
  2. Does increasing the number of calibration samples actually help with this problem? I also found another approach online: setting damp_percent to 0.1:
    https://github.com/AutoGPTQ/AutoGPTQ/pull/111#issuecomment-1624014218
    quantize_config = BaseQuantizeConfig(
        bits=4,
        group_size=128,
        desc_act=False,
        damp_percent=0.1,
    )
    But this project doesn't seem to configure quantization that way. @hiyouga, could you point out where damp_percent=0.1 should be set?
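To illustrate what damp_percent does (a hedged sketch of the mechanism the linked AutoGPTQ PR describes, not this project's code): before the Cholesky step, a fraction of the mean of the Hessian diagonal is added back onto the diagonal. This shifts every eigenvalue upward and restores positive-definiteness even when the raw Hessian is singular:

```python
import numpy as np

# Hypothetical sketch of GPTQ's damping step: add
# damp_percent * mean(diag(H)) to H's diagonal so that the
# smallest eigenvalue becomes strictly positive.
def damp_hessian(H: np.ndarray, damp_percent: float = 0.1) -> np.ndarray:
    damp = damp_percent * np.mean(np.diag(H))
    return H + damp * np.eye(H.shape[0])

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 16))
H = 2.0 * X @ X.T                      # singular: raw Cholesky would fail

L = np.linalg.cholesky(damp_hessian(H))       # succeeds after damping
print(np.allclose(L @ L.T, damp_hessian(H)))  # True
```

In numerical terms this is ridge (Tikhonov) regularization: a larger damp_percent makes the factorization more robust at a small cost in quantization accuracy.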

Others

No response
