
Model Runtime #1858

Merged · 1,324 commits merged into main from feat/model-runtime on Jan 2, 2024

Conversation

takatost
Collaborator

@takatost takatost commented Jan 1, 2024

🎉🎉 Dify's Version 0.4 is out now.

We've made some serious under-the-hood changes to how the Model Runtime works, making it more straightforward for our specific needs, and paving the way for smoother model expansions and more robust production use.

What's Changed:

  • Model Runtime Rework: We've moved away from LangChain, simplifying the model layer. Now, expanding models is as easy as setting up the model provider in the backend with a bit of YAML.

    For more details, see: https://github.com/langgenius/dify/blob/feat/model-runtime/api/core/model_runtime/README.md
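    As a rough illustration, a preset model for a provider might be declared in YAML along these lines (a hedged sketch; the field names here are assumptions loosely based on the Model Runtime README, not a verified schema):

    ```yaml
    # Hypothetical preset model definition (illustrative field names only)
    model: my-chat-model
    label:
      en_US: My Chat Model
    model_type: llm
    features:
      - agent-thought
    model_properties:
      mode: chat
      context_size: 8192
    ```

    The point of the YAML-driven approach is that adding a model becomes a declarative change rather than new LangChain glue code.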

  • App Generation Update: Replacing the old Redis Pubsub queue with threading.Queue for a more reliable, performant, and straightforward workflow.
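  The producer/consumer pattern behind this change can be sketched as follows (a minimal illustration of streaming via an in-process `threading.Queue`, not Dify's actual code):

  ```python
  import queue
  import threading

  def generate_chunks(q: queue.Queue) -> None:
      # Worker thread: push streamed model output chunks onto the in-process queue,
      # in place of publishing them to a Redis Pub/Sub channel.
      for chunk in ["Hello", ", ", "world"]:
          q.put(chunk)
      q.put(None)  # sentinel: generation finished

  def stream_response() -> str:
      # Consumer: drain the queue until the sentinel, as an SSE handler might.
      q: queue.Queue = queue.Queue()
      worker = threading.Thread(target=generate_chunks, args=(q,))
      worker.start()
      parts = []
      while True:
          item = q.get()
          if item is None:
              break
          parts.append(item)
      worker.join()
      return "".join(parts)

  print(stream_response())  # -> Hello, world
  ```

  Because producer and consumer live in the same process, there is no broker round-trip and no risk of dropped Pub/Sub messages.
  
  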

  • Model Providers Upgraded: Support for both preset and custom models, ideal for adding OpenAI fine-tuned models or fitting into various MaaS platforms. Plus, you can now check out supported models without any initial configuration.

  • Context Size Definition: Introduced distinct context size settings, separate from Max Tokens, to handle the different limits and sizes in models like OpenAI's GPT-4 Turbo.
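  The distinction matters because a model's total context window and its maximum completion length are separate limits. A minimal sketch of how a caller might budget completion tokens under both limits (an illustrative helper, not Dify's actual implementation):

  ```python
  def remaining_completion_budget(context_size: int, prompt_tokens: int, max_tokens: int) -> int:
      """Cap the completion length by both the user-set max_tokens and
      the space left in the model's context window."""
      return max(0, min(max_tokens, context_size - prompt_tokens))

  # A GPT-4 Turbo-like model: large context window, smaller max output
  print(remaining_completion_budget(128000, 120000, 4096))  # -> 4096
  # A smaller-context model where the prompt nearly fills the window
  print(remaining_completion_budget(8192, 8000, 4096))      # -> 192
  ```
  
  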

  • Flexible Model Parameters: Customize your model's behavior with easily adjustable parameters through YAML.
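  For instance, an adjustable parameter might be exposed through a rule block like the following (a hedged sketch; the names mirror common LLM sampling parameters rather than a verified schema):

  ```yaml
  # Hypothetical parameter rule (illustrative only)
  parameter_rules:
    - name: temperature
      type: float
      default: 0.7
      min: 0.0
      max: 2.0
      help:
        en_US: Controls the randomness of sampling.
  ```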

  • GPT-2 Tokenizer Files: Now cached within Dify's codebase, making builds quicker and solving issues related to acquiring tokenizer files in offline source deployments.

  • Model List Display: The App now displays all supported preset models, including details on any that aren't available and how to configure them.

  • New Model Additions: Including Google's Gemini Pro and Gemini Pro Vision models (Vision requires an image input), Azure OpenAI's GPT-4V, and support for OpenAI-API-compatible providers.

  • Expanded Inference Support: Xorbits Inference now includes chat mode models, and a wider range of models now supports Agent inference.

  • Updates & Fixes: We've updated other model providers to be in sync with the latest version APIs and features, and squashed a series of minor bugs for a smoother experience.

Catch you in the code,

The Dify Team 🛠️

@takatost takatost merged commit d069c66 into main Jan 2, 2024
2 checks passed
@takatost takatost deleted the feat/model-runtime branch January 2, 2024 15:42
@santiagoblanco22

Nice work!! Is it possible to add fine-tuned models from OpenAI right now?
Thanks for this work!!


sentry-io bot commented Jan 15, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ TypeError: Object of type ModelType is not JSON serializable core.features.dataset_retrieval in retrieve View Issue

Did you find this useful? React with a 👍 or 👎

@takatost
Collaborator Author

Nice work!! Is it possible to add fine-tuned models from OpenAI right now? Thanks for this work!!

yep

HuberyHuV1 pushed a commit to HuberyHuV1/dify that referenced this pull request Jul 22, 2024
Co-authored-by: StyleZhang <[email protected]>
Co-authored-by: Garfield Dai <[email protected]>
Co-authored-by: chenhe <[email protected]>
Co-authored-by: jyong <[email protected]>
Co-authored-by: Joel <[email protected]>
Co-authored-by: Yeuoly <[email protected]>
Labels
size:XXL This PR changes 1000+ lines, ignoring generated files.
9 participants