feat: support LLM process document file #10966

Merged
merged 11 commits into from
Nov 22, 2024

hjlarry
Contributor

@hjlarry hjlarry commented Nov 22, 2024

Summary

Many LLMs (Gemini, Sonnet, ...) can now process documents directly and use them as context for the user's chat. This PR aims to support that feature in a Dify agent app. For the chatflow app, maybe this PR can resolve it.

ChangeList

Backend

Frontend

  • add a switch to control whether document upload is allowed
  • for LLMs that do not support the video feature, enabling the vision config does not allow video upload
  • the chat page displays the file upload button only if file.allowed_file_types has any value
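
The three frontend rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the feature names, `FileUploadConfig`, `build_upload_config`, and `should_show_upload_button` are all hypothetical, and only `allowed_file_types` is taken from the description. It also assumes the document switch alone is enough to allow document upload.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical model feature names, used only for this sketch.
VISION = "vision"
VIDEO = "video"


@dataclass
class FileUploadConfig:
    # Mirrors the `file.allowed_file_types` field mentioned in the change list.
    allowed_file_types: list[str] = field(default_factory=list)


def build_upload_config(
    model_features: set[str],
    vision_enabled: bool,
    document_enabled: bool,
) -> FileUploadConfig:
    """Derive the allowed file types from the model features and the two switches."""
    allowed: list[str] = []
    if vision_enabled and VISION in model_features:
        allowed.append("image")
        # Video is only offered when the model itself supports it.
        if VIDEO in model_features:
            allowed.append("video")
    # Assumption: the document switch is independent of the vision switch.
    if document_enabled:
        allowed.append("document")
    return FileUploadConfig(allowed_file_types=allowed)


def should_show_upload_button(config: FileUploadConfig) -> bool:
    """The chat page shows the upload button only when some file type is allowed."""
    return len(config.allowed_file_types) > 0
```

With this shape, turning off both switches leaves `allowed_file_types` empty, so the upload button is hidden without a separate flag.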

Remaining issues

  • the vision settings are odd: Resolution only affects image files, while Upload Method and Upload Limit affect all file types.

Screenshots

image

Checklist

Important

Please review the checklist below before submitting your pull request.

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

@hjlarry hjlarry marked this pull request as ready for review November 22, 2024 06:04
@dosubot dosubot bot added the size:L, ⚙️ feat:model-runtime, and 💪 enhancement labels Nov 22, 2024
@laipz8200
Member

Thank you for this awesome contribution! Some changes in #10679 are still in progress; I'll review this PR after #10679 is merged.

@laipz8200
Member

the vision settings are odd: Resolution only affects image files, while Upload Method and Upload Limit affect all file types.

I think this resolution config should be removed in the future.

@laipz8200
Member

Hi @hjlarry! #10679 is merged, could you please sync the code with the main branch?

@hjlarry
Contributor Author

hjlarry commented Nov 22, 2024

Hi @hjlarry! #10679 is merged, could you please sync the code with the main branch?

Done :)

@laipz8200
Member

Screen.Recording.2024-11-22.at.5.56.05.PM.mov

@hjlarry
Contributor Author

hjlarry commented Nov 22, 2024

Screen.Recording.2024-11-22.at.5.56.05.PM.mov

It seems the icon was overwritten by the merge; please try again.

@dosubot dosubot bot added the lgtm label Nov 22, 2024
@laipz8200 laipz8200 merged commit 08ac368 into langgenius:main Nov 22, 2024
7 checks passed
Labels
💪 enhancement New feature or request ⚙️ feat:model-runtime lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.