Releases · modelscope/ms-swift
v2.6.0
English Version
Models
- Support Qwen2.5 coder models
Features
- Support the new loss and gradient accumulation algorithm from transformers.Trainer, and fix its bugs
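The gradient-accumulation fix addresses a general pitfall: averaging per-micro-batch mean losses is not the same as the loss over one large batch when micro-batches contain different token counts. A minimal pure-Python sketch of the discrepancy (illustrative only, not ms-swift or transformers code):

```python
# Why per-micro-batch mean losses must be renormalized under gradient
# accumulation: the naive average over-weights short micro-batches.

def naive_ga_loss(micro_batches):
    # Average of per-micro-batch means (biased when lengths differ).
    return sum(sum(b) / len(b) for b in micro_batches) / len(micro_batches)

def token_normalized_ga_loss(micro_batches):
    # Normalize by the total token count across the accumulated batch,
    # matching the loss of one large batch.
    total_tokens = sum(len(b) for b in micro_batches)
    return sum(sum(b) for b in micro_batches) / total_tokens

# Two micro-batches with unequal numbers of per-token losses:
mb = [[1.0, 1.0, 1.0, 1.0], [3.0, 3.0]]
print(naive_ga_loss(mb))             # 2.0 (biased)
print(token_normalized_ga_loss(mb))  # 1.666... == mean over all 6 tokens
```

The token-normalized form is what makes accumulated small batches equivalent to one large batch.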
What's Changed
- fix gptq ovis quant (inputs_embeds) by @Jintao-Huang in #2378
- [TorchAcc] fix qwen2 for transformers>=4.45 by @baoleai in #2379
- fix trl transformers==4.46 compat by @Jintao-Huang in #2385
- fix deploy stream media_type by @Jintao-Huang in #2393
- fix_mplug_owl3_cut_shape by @Jintao-Huang in #2394
- fix swift deploy (lmdeploy stream) by @Jintao-Huang in #2397
- fix lmdeploy warning & gptq-int4 support by @Jintao-Huang in #2401
- support qwen2.5-coder by @Jintao-Huang in #2400
- fix qwen_vl npu by @Jintao-Huang in #2408
- fix model path by @yingdachen in #2410
- fix qwen_vl dpo by @Jintao-Huang in #2411
- Compat transformers 4.46.2 loss by @Jintao-Huang in #2413
- Fix molmo infer by @Jintao-Huang in #2419
- Fix dataset map by @Jintao-Huang in #2421
- Support qwen2.5 coder series model by @Jintao-Huang in #2422
- fix PPO by @hjh0119 in #2377
- fix docs by @Jintao-Huang in #2425
- compat transformers==4.47 by @Jintao-Huang in #2426
- fix internvl2 lmdeploy>=0.6 deploy by @Jintao-Huang in #2429
Full Changelog: v2.5.2...v2.6.0
v2.5.2
New Models:
- emu3-chat
- aya-expanse
- ministral-8b-inst-2410
New Datasets:
- llava-video-178k
- moviechat-1k-test
What's Changed
- fix system by @Jintao-Huang in #2307
- Support Llava-video-178k dataset, a video dataset by @DaozeZhang in #2309
- update molmo readme by @mi804 in #2313
- fix molmo device_map by @Jintao-Huang in #2311
- support emu3-chat by @mi804 in #2322
- Support Moviechat-1k-test dataset, a video dataset by @DaozeZhang in #2319
- fix custom vllm eval args by @Yunnglin in #2325
- fix enable_cache by @Jintao-Huang in #2332
- feat: Add aya models by @Aunali321 in #2335
- compat transformers 4.46 by @Jintao-Huang in #2341
- fix streaming by @tastelikefeet in #2350
- fix doc by @tastelikefeet in #2355
- fix internvl2 eval by @Jintao-Huang in #2362
- fix transformers>=4.46 loss by @Jintao-Huang in #2365
- Support Ministral-8b-inst-2410 by @DaozeZhang in #2371
- fix custom loss by @Jintao-Huang in #2374
New Contributors
- @Aunali321 made their first contribution in #2335
Full Changelog: v2.5.1...v2.5.2
v2.5.1
English Version
New Features:
- Support for reward modeling (RM) for LLMs and MLLMs, as well as PPO for LLMs.
New Models:
- molmo series
- mplug-owl3 1b/2b
- llama3.1-nemotron-70b-instruct
- deepseek-janus
What's Changed
- support reward modeling and ppo by @hjh0119 in #2093
- fix rescale_image by @tastelikefeet in #2223
- fix deploy timeout by @Jintao-Huang in #2230
- Fix qwen2 vl batch size by @Jintao-Huang in #2239
- Fix ovis1.6 infer by @Jintao-Huang in #2242
- fix publish by @Jintao-Huang in #2245
- fix qwen2vl video args by @Jintao-Huang in #2251
- Update FAQ by @slin000111 in #2252
- Support molmo series vlm by @mi804 in #2260
- fix sft system by @Jintao-Huang in #2262
- support mplug3 1b/2b by @Jintao-Huang in #2271
- Fix deploy openai by @Jintao-Huang in #2278
- fix vllm ignore suffix by @Jintao-Huang in #2287
- fix lora_target_modules in PPO by @hjh0119 in #2274
- fix quant blocks by @Jintao-Huang in #2292
- Support Llama3.1-nemotron-70b-inst-hf by @DaozeZhang in #2299
- fix ppo citest by @hjh0119 in #2302
- support deepseek-janus by @Jintao-Huang in #2300
- update molmo by @Jintao-Huang in #2305
Full Changelog: v2.5.0...v2.5.1
v2.5.0
English Version
New Features:
- Support for GPTQ & AWQ quantization of multimodal LLMs.
- Support for dynamically enabling gradient checkpointing in the ViT component to reduce memory consumption.
- Support for multimodal model pre-training.
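Dynamically enabling gradient checkpointing on a vision tower can be sketched with plain PyTorch; the `TinyViT` model and `grad_ckpt` flag below are hypothetical stand-ins, not ms-swift's actual mechanism:

```python
# Sketch: toggle gradient checkpointing on a ViT at runtime to trade
# recomputation for activation memory (assumes PyTorch).
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class TinyViTBlock(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.ff(x)

class TinyViT(nn.Module):
    def __init__(self, dim=16, depth=2):
        super().__init__()
        self.blocks = nn.ModuleList(TinyViTBlock(dim) for _ in range(depth))
        self.grad_ckpt = False  # toggled dynamically, e.g. by a training flag

    def forward(self, x):
        for blk in self.blocks:
            if self.grad_ckpt and self.training:
                # Recompute this block's activations in backward
                # instead of storing them during forward.
                x = checkpoint(blk, x, use_reentrant=False)
            else:
                x = blk(x)
        return x

vit = TinyViT().train()
inp = torch.randn(2, 8, 16, requires_grad=True)
out_plain = vit(inp)
vit.grad_ckpt = True
out_ckpt = vit(inp)
# Same forward result; only peak memory during backward differs.
assert torch.allclose(out_plain, out_ckpt)
```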
New Models:
- llama3.2, llama3.2-vision series
- got-ocr2
- llama3.1-omni
- ovis1.6-gemma2
- pixtral-12b
- telechat2-115b
- mistral-small-inst-2409
New Datasets:
- egoschema
What's Changed
- fix win32 quote by @tastelikefeet in #2065
- Fix yi template by @Jintao-Huang in #2067
- fix rlhf zero3 by @Jintao-Huang in #2072
- Update qwen2-vl最佳实践.md by @Digital2Slave in #2058
- fix RLHF & max_length by @Jintao-Huang in #2075
- Support Mistral-small-inst-2409 by @DaozeZhang in #2077
- dynamic vit gradient_checkpointing by @Jintao-Huang in #2071
- fix qwen2.5 template by @Jintao-Huang in #2081
- fix multiprocess remove_columns by @Jintao-Huang in #2088
- Support for fine-tuning Pixtral-12B. by @Jintao-Huang in #2090
- fix vllm tokenizer by @Jintao-Huang in #2099
- Fix the issue with media_offset in owl3 when batch_size > 1. by @LukeForeverYoung in #2100
- fix deploy openai compat by @Jintao-Huang in #2101
- fix dataset preprocess by @Jintao-Huang in #2102
- fix cpu infer device_map by @Jintao-Huang in #2103
- fix infer device_map by @Jintao-Huang in #2105
- Support for fine-tuning Llama 3.1 Omni. by @Jintao-Huang in #2106
- support vllm & qwen2-vl video by @Jintao-Huang in #2110
- Fix qwen2-vl zero2/3 by @Jintao-Huang in #2114
- fix qwen2-audio by @Jintao-Huang in #2116
- [TorchAcc] fix: fix find_labels and can_return_loss by @baoleai in #2120
- support got-ocr2 by @Jintao-Huang in #2123
- Support for fine-tuning and deployment of the Llama 3.2 series models. by @Jintao-Huang in #2130
- Support fine-tuning MLLama. by @Jintao-Huang in #2132
- fix not impl bug by @Jintao-Huang in #2134
- Compat vllm & qwen2-vl by @Jintao-Huang in #2136
- fix requirements by @Jintao-Huang in #2137
- fix model_type by @Jintao-Huang in #2138
- fix deploy vllm by @Jintao-Huang in #2141
- fix docs by @Jintao-Huang in #2142
- Fix VLM lora by @tastelikefeet in #2140
- support mllm pt by @Jintao-Huang in #2146
- [TorchAcc] fix: fix save config and additional file for swift and peft by @baoleai in #2149
- update quant_device_map by @Jintao-Huang in #2154
- fix qwen2-audio by @Jintao-Huang in #2157
- fix template by @Jintao-Huang in #2160
- compat trl==0.11 by @Jintao-Huang in #2169
- Support for Egoschema, a new video dataset by @DaozeZhang in #2173
- Update FAQ by @slin000111 in #2165
- fix mplug-owl3 infer by @Jintao-Huang in #2175
- Support quant mllm by @Jintao-Huang in #2177
- update setup.py by @Jintao-Huang in #2205
- fix bugs by @Jintao-Huang in #2207
- support telechat2 by @Jintao-Huang in #2210
- Support ovis 1.6 by @Jintao-Huang in #2211
New Contributors
- @Digital2Slave made their first contribution in #2058
- @LukeForeverYoung made their first contribution in #2100
Full Changelog: v2.4.2...v2.5.0
v2.4.2
English Version
New Features:
- RLHF refactored: supports all integrated multimodal models, is compatible with DeepSpeed ZeRO-2/ZeRO-3, and supports lazy_tokenize.
- With infer_backend vllm, inference and deployment of multimodal LLMs now support multiple images.
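A multi-image request in the OpenAI-compatible message format looks like the sketch below; the helper function and URLs are illustrative, and only the content-part field names follow the OpenAI vision API:

```python
# Sketch: building one user message that carries several images
# (field names per the OpenAI vision message format; URLs are placeholders).
def build_multi_image_message(text, image_urls):
    content = [{"type": "text", "text": text}]
    content += [
        {"type": "image_url", "image_url": {"url": url}} for url in image_urls
    ]
    return {"role": "user", "content": content}

msg = build_multi_image_message(
    "Compare these two images.",
    ["https://example.com/a.jpg", "https://example.com/b.jpg"],
)
assert len(msg["content"]) == 3  # one text part + two image parts
```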
New Models:
- Qwen2.5 series, Qwen2-vl-72b series (base/instruct/gptq-int4/gptq-int8/awq)
- Qwen2.5-math, Qwen2.5-coder series (base/instruct)
- Deepseek-v2.5
New Datasets:
- longwriter-6k-filtered
What's Changed
- fix model_mapping by @Jintao-Huang in #1982
- fix patch by @Jintao-Huang in #1997
- fix by @tastelikefeet in #1995
- Support Deepseek 2.5 by @DaozeZhang in #1992
- fix EngineGenerationConfig importError of lmdeploy by @irexyc in #1990
- compat lmdeploy==0.6 by @Jintao-Huang in #2001
- Fix rlhf ref model by @Jintao-Huang in #2003
- Support llava1.6-llama3.1-8b-instruct by @DaozeZhang in #2005
- fix lmdeploy qwen_vl by @Jintao-Huang in #2009
- Add FAQ Document by @slin000111 in #2013
- Florence use _post_encode & template support encoder-decoder by @Jintao-Huang in #2019
- refactor rlhf by @Jintao-Huang in #1975
- update code by @Jintao-Huang in #2028
- fix deploy eval kill by @Jintao-Huang in #2029
- Fix olora and pissa saving files which will cause the second saving failed by @tastelikefeet in #2032
- fix rlhf & zero3 by @Jintao-Huang in #2034
- Add longwriter filtered dataset by @wangxingjun778 in #2037
- fix mplug-owl3 by @Jintao-Huang in #2042
- support multi bbox grounding by @tastelikefeet in #2045
- Fix multi coordinate grounding by @tastelikefeet in #2047
- llama3 tool calling by @tastelikefeet in #2048
- update docs by @Jintao-Huang in #2050
- fix qwen2vl position_ids by @Jintao-Huang in #2051
- support qwen2-vl-base by @Jintao-Huang in #2052
- Support qwen2.5 by @Jintao-Huang in #2054
- support qwen2-vl -72b/qwen2.5-math/qwen2.5-coder by @Jintao-Huang in #2056
- vllm support multi image by @Jintao-Huang in #2059
- support qwen2.5-coder by @Jintao-Huang in #2061
- fix notebook gradio by @Jintao-Huang in #2062
- update qwen2-vl docs by @Jintao-Huang in #2063
New Contributors
- @irexyc made their first contribution in #1990
- @wangxingjun778 made their first contribution in #2037
Full Changelog: v2.4.1...v2.4.2
v2.4.1
English Version
New Features:
- Inference and deployment support for logprobs.
- RLHF support for lazy_tokenize.
- Multimodal model support for neftune.
- dynamic_eos compatibility with glm4 series and other models.
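Logprobs come back as per-token log probabilities in the OpenAI-style response; a sketch of recovering sampling probabilities from them (the response values here are made up):

```python
# Sketch: consuming logprobs from an OpenAI-compatible completion response.
# The response shape follows the OpenAI API; the numbers are invented.
import math

response_chunk = {
    "logprobs": {
        "content": [
            {"token": "Hello", "logprob": -0.1},
            {"token": "!", "logprob": -2.3},
        ]
    }
}

def token_probabilities(chunk):
    # exp(logprob) recovers each token's sampling probability.
    return {t["token"]: math.exp(t["logprob"]) for t in chunk["logprobs"]["content"]}

probs = token_probabilities(response_chunk)
assert 0.904 < probs["Hello"] < 0.905  # exp(-0.1) ≈ 0.9048
```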
New Models:
- mplug-owl3, best practices can be found here.
- yi-coder 1.5b/9b base and chat models.
- minicpm3-4b.
- reflection-llama3.1-70b.
What's Changed
- Fix push_to_hub when last-checkpoint by @tastelikefeet in #1897
- support custom quantized dataset by @tastelikefeet in #1893
- fix push_to_ms by @tastelikefeet in #1901
- support logprobs by @Jintao-Huang in #1900
- deepspeed use cosine lr_scheduler by @Jintao-Huang in #1907
- update docs by @Jintao-Huang in #1908
- fix web-ui push to hub strategy by @tastelikefeet in #1909
- Refactor docs by @tastelikefeet in #1912
- refactor docs by @tastelikefeet in #1915
- [TorchAcc] perf: use xm.save instead of torch.save by @baoleai in #1916
- update wechat by @tastelikefeet in #1925
- update docs & fix bug by @Jintao-Huang in #1926
- [TorchAcc] fix: fix the judgement of fsdp_num by @baoleai in #1903
- Support deploy & logprobs by @Jintao-Huang in #1833
- fix typing by @Jintao-Huang in #1933
- fix swift deploy by @Jintao-Huang in #1936
- update yi-coder by @Jintao-Huang in #1942
- fix lmdeploy seed by @Jintao-Huang in #1945
- fix do_sample by @Jintao-Huang in #1946
- refactor rlhf by @hjh0119 in #1885
- fix file rename error in megatron when there are multi process by @Zhikaiiii in #1948
- fix qwen2-vl & video by @Jintao-Huang in #1950
- support dynamic_eos by @Jintao-Huang in #1947
- fix rlhf by @hjh0119 in #1949
- Support minicpm 3 by @Jintao-Huang in #1952
- Add lazy_tokenize to RLHF by @tastelikefeet in #1956
- Fix data info print in rlhf by @tastelikefeet in #1964
- Fix the lora hook by @tastelikefeet in #1963
- fix bugs by @Jintao-Huang in #1959
- support mplug_owl3 by @Jintao-Huang in #1957
- update docs by @Jintao-Huang in #1970
- Add reflection model by @tastelikefeet in #1973
- fix typo by @Jintao-Huang in #1980
Full Changelog: v2.4.0...v2.4.1
v2.4.0
English Version
New Features:
- Support for Liger, which accommodates models like LLaMA, Qwen, Mistral, etc., and reduces memory usage by 10% to 60%.
- Support for custom loss function training using a registration mechanism.
- Training now supports pushing models to ModelScope and HuggingFace.
- Support for the freeze_vit parameter to control the behavior of full-parameter training for multimodal models.
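A registration mechanism for custom losses typically boils down to a name-to-function registry; the sketch below shows the pattern only, with a hypothetical registry rather than ms-swift's actual API:

```python
# Sketch of a registration-style custom loss (hypothetical registry,
# not ms-swift's real interface).
LOSS_REGISTRY = {}

def register_loss(name):
    """Decorator that makes a loss selectable by name at training time."""
    def decorator(fn):
        LOSS_REGISTRY[name] = fn
        return fn
    return decorator

@register_loss("mse")
def mse_loss(pred, target):
    # Toy list-based mean squared error.
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

# A trainer would look the loss up by the name given in its config:
loss_fn = LOSS_REGISTRY["mse"]
assert loss_fn([1.0, 2.0], [1.0, 4.0]) == 2.0
```

The decorator runs at import time, so any loss defined in user code is available by name before training starts.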
New Models:
- Qwen2-VL series includes GPTQ/AWQ quantized models. For best practices, see here.
- InternVL2 AWQ quantized models.
New Datasets:
- qwen2-pro series
What's Changed
- compat with vllm==0.5.5 by @Jintao-Huang in #1812
- Support zero2 offload by @Jintao-Huang in #1814
- fix mp+ddp & resume_from_checkpoint by @Jintao-Huang in #1815
- fix preprocess_num_proc by @Jintao-Huang in #1818
- Support liger by @tastelikefeet in #1819
- fix dora deployment by @tastelikefeet in #1821
- Support register loss func by @Jintao-Huang in #1822
- use default-lora by @Jintao-Huang in #1823
- fix minicpm-v 2.6 infer device_map by @Jintao-Huang in #1832
- Fix code by @tastelikefeet in #1824
- fix inject by @tastelikefeet in #1835
- support qwen2-pro dataset by @Jintao-Huang in #1834
- add ddp_timeout parameter by @tastelikefeet in #1836
- fix internlm-xcomposer rlhf by @hjh0119 in #1838
- Support eval_nproc by @tastelikefeet in #1843
- support qwen2-vl by @Jintao-Huang in #1842
- Add internvl2 awq models by @tastelikefeet in #1846
- Fix some datasets for streaming by @tastelikefeet in #1848
- Fix Pissa and OLoRA by @tastelikefeet in #1852
- Support qwen2 vl grounding by @tastelikefeet in #1854
- support qwen2-vl & video finetune by @Jintao-Huang in #1849
- Update new datasets by @tastelikefeet in #1855
- update qwen2-vl docs by @Jintao-Huang in #1856
- update qwen2-vl docs by @Jintao-Huang in #1858
- fix qwen2-vl docs by @Jintao-Huang in #1861
- fix requirements by @Jintao-Huang in #1864
- update docs qwen2-vl by @Jintao-Huang in #1869
- Support faster data map by @tastelikefeet in #1871
- [TorchAcc] fix several bugs for torchacc FSDP. by @baoleai in #1872
- Add train record by @tastelikefeet in #1873
- Fix num_proc by @Jintao-Huang in #1874
- Fix neftune doc by @tastelikefeet in #1875
- add duet by @tastelikefeet in #1877
- use model.generation_config by @Jintao-Huang in #1850
- Support freeze vit by @Jintao-Huang in #1880
- support qwen2-vl gptq awq by @Jintao-Huang in #1884
- Refactor push_to_hub by @tastelikefeet in #1883
- Fix push to hub logic by @tastelikefeet in #1888
- add vllm lmdeploy benchmark by @Jintao-Huang in #1889
- Add some warnings and fix RLHF by @tastelikefeet in #1890
Full Changelog: v2.3.2...v2.4.0
v2.3.2
English Version
New Features:
- ReFT support: achieves parameter efficiency that is 15× to 65× greater than LoRA.
- Multimodal model supports zero3.
- Supports using environment variables to control model-specific parameters such as hd_num, max_num, and video_segments.
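The parameter-efficiency claim can be sanity-checked with back-of-envelope counts. The per-module formulas below follow the LoRA and LoReFT papers, but the ranks and layer counts are assumed, so the exact ratio varies by configuration (which is why the quoted figure is a 15× to 65× range):

```python
# Rough comparison of trainable parameter counts: LoRA vs. a LoReFT-style
# intervention. Dimensions and ranks below are assumptions for illustration.

def lora_params(d, r, n_matrices):
    # Each adapted d x d weight gets A (r x d) and B (d x r).
    return n_matrices * 2 * d * r

def loreft_params(d, r, n_layers):
    # Per intervened layer: R (r x d), W (r x d), bias (r).
    return n_layers * (2 * d + 1) * r

d, layers = 4096, 32
lora = lora_params(d, r=8, n_matrices=4 * layers)  # q/k/v/o in every layer
reft = loreft_params(d, r=4, n_layers=layers)
print(lora / reft)  # roughly 8x fewer params in this particular setup
```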
New Models:
- longwriter-glm4-9b, longwriter-llama3_1-8b
- phi3_5-mini-instruct, phi3_5-moe-instruct, phi3_5-vision-instruct
- llava-onevision-qwen2-0_5b-ov, llava-onevision-qwen2-7b-ov, llava-onevision-qwen2-72b-ov
New Datasets:
- longwriter-6k
- rlaif-v
- latex-ocr-print, latex-ocr-handwrite
What's Changed
- fix imports by @tastelikefeet in #1748
- compat with torch=1.12/1.13 by @Jintao-Huang in #1752
- update rlaif-v hf dataset by @hjh0119 in #1755
- fix lmdeploy: AssertionError: failed to match chat template, please explicit set chat_template_config by @Jintao-Huang in #1759
- use eager -> sdpa by @Jintao-Huang in #1764
- Fix GLM4 agent toolcall by @tastelikefeet in #1767
- Support LongWriter-llama3.1-8b and LongWriter-glm4-9b. by @DaozeZhang in #1762
- Support llava onevision by @Jintao-Huang in #1761
- [TorchAcc] fix: fix saving and loading checkpoint for full sft FSDP by @baoleai in #1765
- Fix deepseek-coder-v2-lite template by @Jintao-Huang in #1771
- Fix qwen2-audio & zero3 by @Jintao-Huang in #1774
- Fix zero3 & minicpm-v/internvl2/xcomposer by @Jintao-Huang in #1772
- fix infer dataset_test_ratio by @Jintao-Huang in #1779
- fix moe & gradient_checkpointing by @Jintao-Huang in #1782
- support phi3.5-vision by @Jintao-Huang in #1780
- ReFT by @tastelikefeet in #1785
- update doc by @tastelikefeet in #1789
- support qwen-vl & base64 by @Jintao-Huang in #1790
- fix yi-vl template by @Jintao-Huang in #1793
- fix bugs by @Jintao-Huang in #1794
- fix imports by @tastelikefeet in #1796
- fix history_roles by @Jintao-Huang in #1798
- fix mllm rlhf with full sft type by @hjh0119 in #1800
- fix CI by @tastelikefeet in #1797
- fix megatron_patch_path by @wning13 in #1804
- Support hd num by @Jintao-Huang in #1801
- Support Latex OCR dataset by @Jintao-Huang in #1810
- fix offline export by @wning13 in #1805
- fix by @tastelikefeet in #1811
Full Changelog: v2.3.1...v2.3.2
v2.3.1
English Version
New Features:
- ms-swift paper published: https://arxiv.org/abs/2408.05517
- Web-UI supports audio and video.
- Support for deploying audio and video models using the OpenAI API.
- Utilizes a new multimodal training framework.
- Supports inference acceleration for video models (lmdeploy & internvl2 series).
New Models:
- idefics3-8b-llama3
- llava-hf 72b, 110b, llama3-llava
- deepseek-coder-v2, deepseek-coder-lite-v2, deepseek-v2
What's Changed
- support video deploy by @Jintao-Huang in #1632
- fix infer deploy by @Jintao-Huang in #1665
- Fix bugs by @tastelikefeet in #1664
- Add generation config support for swift eval by @Yunnglin in #1659
- Add a bunch of small features by @tastelikefeet in #1669
- Support lmdeploy&vllm support video by @Jintao-Huang in #1668
- [TorchAcc] fix: add mark_step when GA is greater than 1 by @baoleai in #1677
- Support audio and video in web-ui by @tastelikefeet in #1681
- Fix lmdeploy deploy seed by @Jintao-Huang in #1678
- add paper link by @tastelikefeet in #1684
- set default writer_batch_size argument in mllm rlhf by @hjh0119 in #1689
- New multimodal framework by @Jintao-Huang in #1594
- fix video component by @tastelikefeet in #1692
- refactor internvl2 by @Jintao-Huang in #1625
- fix model dtype in rlhf by @hjh0119 in #1696
- Fix minicpmv 2.6 & video/plain-text bug by @Jintao-Huang in #1698
- Fix internvl2 device map by @Jintao-Huang in #1701
- Support deepseek-v2 base by @Jintao-Huang in #1708
- [TorchAcc] update data bucketing strategy by @baoleai in #1706
- [TorchAcc] consolidate FSDP ckpt after saving instead of during loading by @baoleai in #1709
- fix deploy stream by @Jintao-Huang in #1713
- update docs by @Jintao-Huang in #1714
- Fix mllm rlhf by @hjh0119 in #1707
- support idefics3 by @Jintao-Huang in #1720
- Fix minicpm-v-v2.6 device_map='auto' & agent deploy by @Jintao-Huang in #1733
- Fix kto by @hjh0119 in #1737
- support llava-hf 72b, 110b, llama3-llava-hf by @Jintao-Huang in #1741
- Fix glm4v rlhf by @hjh0119 in #1745
Full Changelog: v2.3.0...v2.3.1
v2.3.0
English Version
New Features
- Support for readthedocs documentation site at: https://swift.readthedocs.io/en/latest
- Support Megatron architecture training for Qwen series models, and added a new pt command for pretraining. See docs: https://swift.readthedocs.io/en/latest/LLM/Megatron-training.html
- Support LMDeploy for inference and deployment, improving inference acceleration for multimodal models. See: https://swift.readthedocs.io/en/latest/Multi-Modal/LmDeploy-inference-acceleration.html
- Support passing lora target modules via regular expressions
- Support configuring max_memory usage for each GPU in device_map
- export command supports BitsAndBytes quantization
- export command supports Ollama export: https://swift.readthedocs.io/en/latest/LLM/OLLaMA-Export.html
- Support Q-GaLore algorithm
- Support RLHF training for multi-modal models: https://swift.readthedocs.io/en/latest/Multi-Modal/human-preference-alignment-training-documentation.html
- Support evaluation on 100+ datasets for multi-modal models: https://swift.readthedocs.io/en/latest/LLM/LLM-eval.html
- Support resizing input images when memory usage is too high for multi-modal models
- Modified the default LoRA injection for multimodal model training: it now applies to the LLM and projector, yielding better results without significantly increasing training memory.
- Support PEFT 0.12, and added new tuner: fourierft
- Support rope-scaling for multi-modal models
- Support streaming processing of datasets to reduce memory usage; enable with --streaming
- Support vLLM multi-modal inference and deployment
- Support grounding task for popular multi-modal models.
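Regex-based target-module selection amounts to filtering the model's module names by a pattern; the module names below are illustrative:

```python
# Sketch: choosing LoRA target modules with a regular expression.
# The matching idea only; module names are made up for illustration.
import re

module_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.mlp.gate_proj",
    "model.layers.1.self_attn.q_proj",
]

# Match q/k/v projections in any attention block, skip the MLP.
pattern = re.compile(r".*self_attn\.(q|k|v)_proj")
targets = [n for n in module_names if pattern.fullmatch(n)]
assert targets == [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.1.self_attn.q_proj",
]
```

A single pattern like this replaces enumerating every module name by hand across deep models.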
New Models
- qwen2-audio series
- qwen2-math
- codegeex4
- internvl2 series
- llava video
- xcomposer2.5
- cogvlm2-video
- numina-math
- mistral-nemo
- llama3.1 series
- mistral-large
- gemma-2-2b
- internlm2.5 1.8b 20b
- minicpm-v-v2_6-chat
Check: https://swift.readthedocs.io/en/latest/LLM/Supported-models-datasets.html
New Datasets
- zhihu-kol and zhihu-kol-filtered
- SA1B series multi-modal zh datasets
Check: https://swift.readthedocs.io/en/latest/LLM/Supported-models-datasets.html
What's Changed
- fix dependency by @tastelikefeet in #1306
- support codegeex4 by @Jintao-Huang in #1305
- support internvl2 by @hjh0119 in #1304
- support llava video by @Jintao-Huang in #1307
- fix docs by @Jintao-Huang in #1309
- support lr_scheduler_kwargs by @Jintao-Huang in #1310
- Fix internvl2 template by @hjh0119 in #1308
- Fix bugs by @tastelikefeet in #1311
- support warmup_stable_decay by @Jintao-Huang in #1312
- Support xcomposer2.5 by @Jintao-Huang in #1287
- Fix bugs by @tastelikefeet in #1319
- fix bug by @tastelikefeet in #1320
- fix template by @Jintao-Huang in #1321
- support cogvlm2-video by @Jintao-Huang in #1318
- Fix bugs by @tastelikefeet in #1325
- fix web-ui by @tastelikefeet in #1327
- compatible with trl 0.9.6 by @hjh0119 in #1326
- compat with vllm==0.5.1 by @Jintao-Huang in #1329
- Update qrcode by @tastelikefeet in #1332
- fix florence model by @hjh0119 in #1334
- Relaxing requirements for trl by @hjh0119 in #1342
- fix xcomposer2.5 device_map by @Jintao-Huang in #1343
- support generation_info by @Jintao-Huang in #1344
- fix requirements by @Jintao-Huang in #1347
- readthedocs by @tastelikefeet in #1345
- fix sequence parallel get labels by @tastelikefeet in #1352
- fix filelock by @tastelikefeet in #1354
- Add pt command by @tastelikefeet in #1356
- fix generation_info efficiency by @Jintao-Huang in #1359
- fix sh ddp_backend by @Jintao-Huang in #1360
- support LLM & lmdeploy by @Jintao-Huang in #1272
- fix a file path by @tastelikefeet in #1363
- Internvl2 support video by @tastelikefeet in #1366
- fix openai api by @Jintao-Huang in #1367
- fix internvl2-40b by @hjh0119 in #1369
- fix vlm deploy lora & agent by @Jintao-Huang in #1371
- Support lora regex by @tastelikefeet in #1375
- Fix docs by @Jintao-Huang in #1380
- Fix FSDP; Add training percentage to jsonl logging; Add a web-ui component by @tastelikefeet in #1381
- Support max memory args by @tastelikefeet in #1382
- fix max_memory by @tastelikefeet in #1383
- Fix gpu assert calculation by @tastelikefeet in #1384
- fix dataset_sample & deploy stop_words by @Jintao-Huang in #1385
- fix internvl doc by @hjh0119 in #1394
- Fix link by @Jintao-Huang in #1397
- fix vllm==0.5.1 by @Jintao-Huang in #1404
- [TorchAcc] update accelerate API and add llama3-70B by @baoleai in #1400
- Support Ollama and BNB for export by @tastelikefeet in #1407
- Fix glm4v merge lora by @Jintao-Huang in #1410
- [TorchAcc] fix model download when using TorchAcc distributed training by @baoleai in #1408
- Support padding left by @tastelikefeet in #1414
- Fix ollama export by @tastelikefeet in #1416
- fix web-ui params by @tastelikefeet in #1417
- fix hub_token by @Jintao-Huang in #1420
- Update ms hub token by @Jintao-Huang in #1424
- Add numina math model by @tastelikefeet in #1421
- fix internvl template by @Jintao-Huang in #1433
- Internvl series models update by @hjh0119 in #1426
- fix internvl2 template by @Jintao-Huang in #1436
- Fix bug and make lazydataset more stable by @tastelikefeet in #1438
- Fix llava-hf by @tastelikefeet in #1439
- [WIP]Support Q-Galore by @tastelikefeet in #1440
- support deepspeed on ui 2. add tools to client_utils by @tastelikefeet in #1446
- fix r...