
Support llama3.2 vision #5555

Merged · 5 commits · Nov 23, 2024
Conversation

marko1616 (Contributor) commented Sep 26, 2024

🚀 What does this PR do?

Support Llama-3.2-11B-Vision.

✅ Before submitting

🔗 Linked issues

Fixes #5549
Fixes #5796

⚠️ IMPORTANT

bitsandbytes 8-bit quantization is not functional; 4-bit quantization works.

```python
from PIL import Image
import torch

# Open image paths; pass already-loaded PIL images through unchanged.
images = [Image.open(image) if isinstance(image, str) else image for image in images]
image_features = processor.image_processor(images)
_ = image_features.pop("num_tiles")  # metadata, not a model input
image_features = {k: v if isinstance(v, torch.Tensor) else torch.tensor(v) for k, v in image_features.items()}
```
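The drop-metadata-then-convert pattern in that excerpt can be exercised on its own; a minimal sketch with a hypothetical feature dict (the keys and values below are illustrative, not the exact processor output):

```python
import torch

def to_tensor_features(features, drop_keys=("num_tiles",)):
    # Drop metadata keys, then convert any remaining non-tensor values to torch.Tensor.
    kept = {k: v for k, v in features.items() if k not in drop_keys}
    return {k: v if isinstance(v, torch.Tensor) else torch.tensor(v) for k, v in kept.items()}

features = {"pixel_values": [[0.1, 0.2]], "aspect_ratio_ids": [1], "num_tiles": [4]}
tensors = to_tensor_features(features)
```

After the call, `tensors` contains only model inputs, each as a `torch.Tensor`.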
marko1616 (Contributor, Author) replied Sep 26, 2024:

That is because we can't access the text inside get_mm_inputs. How do you think we should fix this — add a new stage, or pass the text input to get_mm_inputs?

hiyouga (Owner) replied:
yep, we should do some work here

src/llamafactory/data/template.py (outdated, resolved)
src/llamafactory/data/mm_plugin.py (outdated, resolved)
marko1616 changed the title from "Support llama3.2vl." to "Support llama3.2vl(WIP)." on Sep 26, 2024
hiyouga added the "pending" label on Sep 29, 2024
marko1616 marked this pull request as draft on October 7, 2024
hiyouga marked this pull request as ready for review on November 23, 2024
hiyouga changed the title from "Support llama3.2vl(WIP)." to "Support llama3.2 vision" on Nov 23, 2024
hiyouga self-requested a review on November 23, 2024
hiyouga (Owner) left a review:

LGTM

hiyouga (Owner) commented Nov 23, 2024:

Verified on Llama3.2 11B vision instruct. [screenshot]

hiyouga merged commit e68ef89 into hiyouga:main on Nov 23, 2024 (12 checks passed)
marko1616 (Contributor, Author):

Ah, thanks for completing this.

hiyouga added the "solved" label and removed the "pending" label on Nov 23, 2024
marko1616 deleted the feat/llama3.2vl branch on November 23, 2024
hiyouga (Owner) commented Nov 23, 2024:

It requires ~24GB for fine-tuning. [screenshots]
Labels: solved
2 participants