
Model Runtime #1858

Merged · 1,324 commits merged into main from feat/model-runtime on Jan 2, 2024

Conversation

takatost
Collaborator

@takatost takatost commented Jan 1, 2024

🎉🎉 Dify's Version 0.4 is out now.

We've made some serious under-the-hood changes to how the Model Runtime works, making it more straightforward for our specific needs, and paving the way for smoother model expansions and more robust production use.

What's Changed:

  • Model Runtime Rework: We've moved away from LangChain, simplifying the model layer. Now, expanding models is as easy as setting up the model provider in the backend with a bit of YAML.

    For more details, see: https://github.com/langgenius/dify/blob/feat/model-runtime/api/core/model_runtime/README.md
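    As a rough illustration, a preset model for a provider might be declared in YAML along these lines (a hedged sketch; the field names here are assumptions loosely based on the Model Runtime README, not a verified schema):

    ```yaml
    # Hypothetical preset model definition (illustrative field names only)
    model: my-chat-model
    label:
      en_US: My Chat Model
    model_type: llm
    features:
      - agent-thought
    model_properties:
      mode: chat
      context_size: 8192
    ```

    The point of the YAML-driven approach is that adding a model becomes a declarative change rather than new LangChain glue code.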

  • App Generation Update: Replacing the old Redis Pubsub queue with threading.Queue for a more reliable, performant, and straightforward workflow.
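  The producer/consumer pattern behind this change can be sketched as follows (a minimal illustration of streaming via an in-process `threading.Queue`, not Dify's actual code):

  ```python
  import queue
  import threading

  def generate_chunks(q: queue.Queue) -> None:
      # Worker thread: push streamed model output chunks onto the in-process queue,
      # in place of publishing them to a Redis Pub/Sub channel.
      for chunk in ["Hello", ", ", "world"]:
          q.put(chunk)
      q.put(None)  # sentinel: generation finished

  def stream_response() -> str:
      # Consumer: drain the queue until the sentinel, as an SSE handler might.
      q: queue.Queue = queue.Queue()
      worker = threading.Thread(target=generate_chunks, args=(q,))
      worker.start()
      parts = []
      while True:
          item = q.get()
          if item is None:
              break
          parts.append(item)
      worker.join()
      return "".join(parts)

  print(stream_response())  # -> Hello, world
  ```

  Because producer and consumer live in the same process, there is no broker round-trip and no risk of dropped Pub/Sub messages.
  
  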

  • Model Providers Upgraded: Support for both preset and custom models, ideal for adding OpenAI fine-tuned models or fitting into various MaaS platforms. Plus, you can now check out supported models without any initial configuration.

  • Context Size Definition: Introduced distinct context size settings, separate from Max Tokens, to handle the different limits and sizes in models like OpenAI's GPT-4 Turbo.
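  The distinction matters because a model's total context window and its maximum completion length are separate limits. A minimal sketch of how a caller might budget completion tokens under both limits (an illustrative helper, not Dify's actual implementation):

  ```python
  def remaining_completion_budget(context_size: int, prompt_tokens: int, max_tokens: int) -> int:
      """Cap the completion length by both the user-set max_tokens and
      the space left in the model's context window."""
      return max(0, min(max_tokens, context_size - prompt_tokens))

  # A GPT-4 Turbo-like model: large context window, smaller max output
  print(remaining_completion_budget(128000, 120000, 4096))  # -> 4096
  # A smaller-context model where the prompt nearly fills the window
  print(remaining_completion_budget(8192, 8000, 4096))      # -> 192
  ```
  
  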

  • Flexible Model Parameters: Customize your model's behavior with easily adjustable parameters through YAML.
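  For instance, an adjustable parameter might be exposed through a rule block like the following (a hedged sketch; the names mirror common LLM sampling parameters rather than a verified schema):

  ```yaml
  # Hypothetical parameter rule (illustrative only)
  parameter_rules:
    - name: temperature
      type: float
      default: 0.7
      min: 0.0
      max: 2.0
      help:
        en_US: Controls the randomness of sampling.
  ```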

  • GPT-2 Tokenizer Files: Now cached within Dify's codebase, making builds quicker and solving issues related to acquiring tokenizer files in offline source deployments.

  • Model List Display: The App now displays all supported preset models, including details on any that aren't available and how to configure them.

  • New Model Additions: Including Google's Gemini Pro and Gemini Pro Vision models (Vision requires an image input), Azure OpenAI's GPT-4V, and support for OpenAI-API-compatible providers.

  • Expanded Inference Support: Xorbits Inference now includes chat mode models, and a wider range of models now supports Agent inference.

  • Updates & Fixes: We've updated other model providers to be in sync with the latest version APIs and features, and squashed a series of minor bugs for a smoother experience.

Catch you in the code,

The Dify Team 🛠️

@takatost takatost merged commit d069c66 into main Jan 2, 2024
2 checks passed
@takatost takatost deleted the feat/model-runtime branch January 2, 2024 15:42
@santiagoblanco22

Nice work!! Is it possible to add fine-tuned models from OpenAI right now?
Thanks for this work!!


sentry-io bot commented Jan 15, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ TypeError: Object of type ModelType is not JSON serializable core.features.dataset_retrieval in retrieve View Issue

Did you find this useful? React with a 👍 or 👎

@takatost
Collaborator Author

Nice work!! Is it possible to add fine-tuned models from OpenAI right now? Thanks for this work!!

yep

HuberyHuV1 pushed a commit to HuberyHuV1/dify that referenced this pull request Jul 22, 2024
Co-authored-by: StyleZhang <[email protected]>
Co-authored-by: Garfield Dai <[email protected]>
Co-authored-by: chenhe <[email protected]>
Co-authored-by: jyong <[email protected]>
Co-authored-by: Joel <[email protected]>
Co-authored-by: Yeuoly <[email protected]>
Labels
size:XXL This PR changes 1000+ lines, ignoring generated files.
9 participants