[BUG] <Error during Embedding Server parsing> #573

Open
2 tasks done
Lionelpang opened this issue Nov 25, 2024 · 1 comment

Comments

@Lionelpang

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • I have searched FAQ

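Current Behavior

Starting embedding-server raises the error below. Does converting the Embedding model to ONNX require a specific procedure, or can I simply convert it myself? Any help resolving this would be appreciated: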

future: <Task finished name='Task-3' coro=<EmbeddingAsyncBackend.process_queue() done, defined at /Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py:66> exception=InvalidArgument('[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid output name:output')>
Traceback (most recent call last):
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py", line 82, in process_queue
    result = await loop.run_in_executor(self.executor, self.embed_documents, batch_texts)
  File "/opt/miniconda3/envs/QAnything/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/utils/general_utils.py", line 148, in get_time_inner
    res = func(*arg, **kwargs)
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py", line 55, in embed_documents
    outputs_onnx = self.session.run(output_names=['output'], input_feed=inputs_onnx)
  File "/opt/miniconda3/envs/QAnything/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid output name:output

Expected Behavior

The ONNX model in use is one I converted myself from the bge-large-zh-v1.5 model.

Environment

- OS: macOS
- NVIDIA Driver:
- CUDA:
- Docker Compose:
- NVIDIA GPU Memory:

QAnything logs

Task exception was never retrieved
future: <Task finished name='Task-3' coro=<EmbeddingAsyncBackend.process_queue() done, defined at /Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py:66> exception=InvalidArgument('[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid output name:output')>
Traceback (most recent call last):
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py", line 82, in process_queue
    result = await loop.run_in_executor(self.executor, self.embed_documents, batch_texts)
  File "/opt/miniconda3/envs/QAnything/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/utils/general_utils.py", line 148, in get_time_inner
    res = func(*arg, **kwargs)
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py", line 55, in embed_documents
    outputs_onnx = self.session.run(output_names=['output'], input_feed=inputs_onnx)
  File "/opt/miniconda3/envs/QAnything/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid output name:output

Steps To Reproduce

No response

Anything else?

No response

@luckyxue

RPC error: [batch_insert], <ParamError: (code=1, message=Collection field dim is 768, but entities field dim is 11)>, <Time:{'RPC start': '2024-12-10 10:01:38.125233', 'RPC error': '2024-12-10 10:01:38.125584'}>

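It looks like the vectors produced by the embedding step do not have a consistent dimension.

Either the Embedding service has crashed or the problem is in Milvus.

Run test_embed.py on the server to check.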

print("Best Configurations:")
print(
    f"Best QPS: Batch Size {best_qps['batch_size']}, Threads {best_qps['num_threads']} (QPS: {best_qps['qps']:.2f})")
print(
    f"Best Latency: Batch Size {best_latency['batch_size']}, Threads {best_latency['num_threads']} (Avg Latency: {best_latency['avg_latency'] * 1000:.2f} ms)")
print(
    f"Best Memory Usage: Batch Size {best_memory['batch_size']}, Threads {best_memory['num_threads']} (Max Memory: {best_memory['max_memory_mb']:.2f} MB)")
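
The RPC error above (collection field dim 768, entities field dim 11) can be caught before the insert by checking each vector's length against the collection's dimension. The following is a minimal sketch of such a check, assuming the embeddings arrive as a list of float sequences. `check_embedding_dims` is a hypothetical helper and is not part of QAnything or pymilvus.

```python
from typing import List, Sequence


def check_embedding_dims(vectors: Sequence[Sequence[float]], expected_dim: int) -> List[int]:
    """Return the indices of vectors whose length differs from expected_dim."""
    return [i for i, vec in enumerate(vectors) if len(vec) != expected_dim]


# Illustrative data only: one vector of the expected size and one of length 11.
vectors = [[0.0] * 768, [0.0] * 11]
bad = check_embedding_dims(vectors, expected_dim=768)
if bad:
    print(f"{len(bad)} vector(s) have the wrong dimension, at indices {bad}")
```

If the check flags vectors, the fault lies in the embedding step rather than in Milvus. One possible cause, given the error in this issue, is that the embedding server is returning something other than the pooled embedding, such as an error payload or the wrong model output.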
