40 VARCHAR Limit on Model Name #6615

mbbyn · 2024-07-24T06:57:42Z

Self Checks

This is only for bug report, if you would like to ask a question, please head to Discussions.
I have searched for existing issues, including closed ones.
I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
Please do not modify this template :) and fill in all the required fields.

Dify version

0.6.14

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

First, add an OpenAI API compatible model with a long name.

Then, try to use the model in a Knowledge Base's settings. Observe the failure and the error logs.

✔️ Expected Behavior

It should work with long model names.

❌ Actual Behavior

It fails to save the model name because the name is longer than the 40 characters allowed by the VARCHAR(40) column.

sqlalchemy.exc.DataError: (psycopg2.errors.StringDataRightTruncation) value too long for type character varying(40)
all values are hidden
[SQL: INSERT INTO dataset_collection_bindings (provider_name, model_name, type, collection_name) VALUES (%(provider_name)s, %(model_name)s, %(type)s, %(collection_name)s) RETURNING dataset_collection_bindings.id, dataset_collection_bindings.created_at]
[parameters: {'provider_name': 'openai_api_compatible', 'model_name': 'sentence-transformers/distiluse-base-multilingual-cased-v1', 'type': 'dataset', 'collection_name': 'Vector_index_df1864b1_1392_456d_ad1f_0ec6030969a3_Node'}]
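
For anyone triaging this, a quick length check on the values from the parameters above shows which field trips the limit. This is only a minimal sketch: `VARCHAR_LIMIT` and `fits_column` are hypothetical names introduced here for illustration and are not part of Dify.

```python
# Hypothetical helper, not Dify code: compare a value against a VARCHAR(n) limit.
# PostgreSQL counts characters (not bytes) for character varying(n).
VARCHAR_LIMIT = 40  # taken from "character varying(40)" in the error message


def fits_column(value: str, limit: int = VARCHAR_LIMIT) -> bool:
    """Return True if `value` has at most `limit` characters."""
    return len(value) <= limit


# Values copied from the [parameters: ...] line of the log.
provider_name = "openai_api_compatible"
model_name = "sentence-transformers/distiluse-base-multilingual-cased-v1"

for label, value in (("provider_name", provider_name), ("model_name", model_name)):
    status = "ok" if fits_column(value) else "too long"
    print(f"{label}: {len(value)} characters, {status}")
```

Only `model_name` exceeds the limit, which matches the failing INSERT into dataset_collection_bindings.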

mbbyn · 2024-07-24T07:02:08Z

For those interested: we host a Hugging Face Text Embeddings Inference server, which exposes an OpenAI-compatible API. To run inference with a particular model, we pass the model name to the server, and these names are usually very long.
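
To illustrate where the long name comes from, here is a rough sketch of how such a request might be built. It assumes the server follows the OpenAI embeddings request shape (a JSON body with `model` and `input` posted to `/v1/embeddings`); the base URL and the `build_embedding_request` helper are hypothetical, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_embedding_request(base_url: str, model: str, text: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings request.

    Assumption: the server accepts POST <base_url>/v1/embeddings with a JSON
    body containing "model" and "input".
    """
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local endpoint; the model name is the one from the error log.
request = build_embedding_request(
    "http://localhost:8080",
    "sentence-transformers/distiluse-base-multilingual-cased-v1",
    "hello world",
)
print(request.full_url)
print(json.loads(request.data)["model"])
```

The same model string has to be stored on the Dify side, which is why the column limit matters.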

mbbyn · 2024-07-24T07:07:26Z

Related #1857 @crazywoola

crazywoola · 2024-07-24T07:08:40Z

Actually, this is pretty easy to fix. I will update the migrations later.

mbbyn · 2024-07-25T16:21:01Z

I have tested this change, but unfortunately the issue still persists. The PR updates provider_name in the embeddings table, but we would want to update model_name in dataset_collection_bindings instead. It is also worth increasing the limit of model_name in other tables, such as embeddings.
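
As a rough idea of what the schema change could look like on PostgreSQL, the sketch below generates `ALTER TABLE ... ALTER COLUMN ... TYPE VARCHAR(n)` statements for the two columns mentioned above. The new length of 255 is an assumption for illustration, not a value taken from any PR, and `widen_varchar_sql` is a hypothetical helper; a real fix would normally go through the project's migration tooling.

```python
def widen_varchar_sql(table: str, column: str, new_length: int) -> str:
    """Return a PostgreSQL statement that changes a column to VARCHAR(new_length)."""
    if new_length <= 0:
        raise ValueError("new_length must be positive")
    return f"ALTER TABLE {table} ALTER COLUMN {column} TYPE VARCHAR({new_length});"


NEW_LENGTH = 255  # assumption: any limit comfortably above typical model names

# Table and column names as mentioned in this thread.
targets = [
    ("dataset_collection_bindings", "model_name"),
    ("embeddings", "model_name"),
]

statements = [widen_varchar_sql(table, column, NEW_LENGTH) for table, column in targets]
for statement in statements:
    print(statement)
```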

HiroshigeAoki · 2024-07-26T12:42:01Z

@mbbyn
I ran into the same issue. So I fixed it.

dosubot bot added the 🐞 bug Something isn't working label Jul 24, 2024

crazywoola self-assigned this Jul 24, 2024

crazywoola linked a pull request Jul 24, 2024 that will close this issue

Fix/6615 40 varchar limit on model name #6623

Merged

12 tasks

crazywoola mentioned this issue Jul 24, 2024

Fix/6615 40 varchar limit on model name #6623

Merged

12 tasks

laipz8200 closed this as completed in #6623 Jul 24, 2024

HiroshigeAoki mentioned this issue Jul 26, 2024

Fix/6615 40 varchar limit on DatasetCollectionBinding and Embedding model name #6723

Merged

12 tasks
