feat(document_extractor): integrate unstructured API for PPTX extraction #10180
- Added support for using the unstructured API for PPTX text extraction when available.
- Falls back to the existing method if API credentials are not configured.
- Ensures flexibility and potentially enhanced performance or accuracy in text extraction.
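The fallback described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `extract_pptx_text`, `extract_via_api`, and `extract_locally` are hypothetical names, and treating "credentials configured" as "both an API URL and an API key are non-empty" is an assumption.

```python
from typing import Callable, Optional


def extract_pptx_text(
    file_content: bytes,
    api_url: Optional[str],
    api_key: Optional[str],
    extract_via_api: Callable[[bytes, str, str], str],
    extract_locally: Callable[[bytes], str],
) -> str:
    """Pick an extraction path for a PPTX file (hypothetical helper).

    Assumption: the API path is used only when both the URL and the key
    are set; otherwise the existing local extraction method is used.
    """
    if api_url and api_key:
        # Credentials are configured: delegate to the API-backed extractor.
        return extract_via_api(file_content, api_url, api_key)
    # Credentials are missing or incomplete: fall back to the local method.
    return extract_locally(file_content)
```

Passing the two extractors in as callables keeps the selection logic testable without a network call or a real PPTX file.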
…m-vdb * 'lindorm-vdb' of github.com:AlwaysBluer/dify: (39 commits)
- Feat : add LLM model indicator in prompt generator (langgenius#10187)
- chore: enable vision support for models in OpenRouter that should have supported vision (langgenius#10191)
- chore : code generator preview hint (langgenius#10188)
- fix: webapp upload file (langgenius#10195)
- fix(api): replace current_user with end_user in file upload (langgenius#10194)
- feat(document_extractor): integrate unstructured API for PPTX extraction (langgenius#10180)
- fix(tools): suppress RuntimeWarnings in podcast audio generator (langgenius#10182)
- [fix] fix the bug that modify document name not effective (langgenius#10154)
- fix(workflow model): ensure consistent timestamp updating (langgenius#10172)
- fix: Cannot find declaration to go to CLEAN_DAY_SETTING (langgenius#10157)
- feat: add gpustack model provider (langgenius#10158)
- refactor(tools): Avoid warnings. (langgenius#10161)
- refactor(migration/model): update column types for workflow schema (langgenius#10160)
- Feat/add-remote-file-upload-api (langgenius#9906)
- fix: upload remote image preview (langgenius#9952)
- clean un-allowed special charters when doing indexing estimate (langgenius#10153)
- refactor(service): handle unsupported DSL version with warning (langgenius#10151)
- Add VESSL AI OpenAI API-compatible model provider and LLM model (langgenius#9474)
- feat: synchronize input/output variables in the panel with generated code by the code generator (langgenius#10150)
- Refined README for better reading experience. (langgenius#10143)
- ...
Thank you for the report, link to #10886
1. The knowledge chunking works fine after the fix. fix10953
2. However, it isn't functioning as expected for pptx extraction.
Could you please open an issue for this and share your version and DSL so we can reproduce this problem?
Sure #10956
Checklist:
Important
Please review the checklist below before submitting your pull request.
dev/reformat (backend) and cd web && npx lint-staged (frontend) to appease the lint gods
Description
#9995
Type of Change
Testing Instructions
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.