[BUG] <Error during Embedding Server parsing> #573

Lionelpang · 2024-11-25T05:55:53Z

Is there an existing issue / discussion for this?

I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

I have searched FAQ

Current Behavior

When starting the embedding-server, the following error is raised. Is there a specific procedure for converting the Embedding model to ONNX, or is converting it myself sufficient? Any help with a fix would be appreciated:

future: <Task finished name='Task-3' coro=<EmbeddingAsyncBackend.process_queue() done, defined at /Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py:66> exception=InvalidArgument('[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid output name:output')>
Traceback (most recent call last):
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py", line 82, in process_queue
    result = await loop.run_in_executor(self.executor, self.embed_documents, batch_texts)
  File "/opt/miniconda3/envs/QAnything/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/utils/general_utils.py", line 148, in get_time_inner
    res = func(*arg, **kwargs)
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py", line 55, in embed_documents
    outputs_onnx = self.session.run(output_names=['output'], input_feed=inputs_onnx)
  File "/opt/miniconda3/envs/QAnything/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid output name:output

Expected Behavior

I am using an ONNX model that I converted myself from the bge-large-zh-v1.5 model.
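The traceback shows `session.run` being asked for an output named `output`, and ONNX Runtime rejects that name, which suggests the self-converted model declares its outputs under different names. A minimal sketch for checking this follows, assuming `onnxruntime` is installed. The helper names `list_output_names` and `pick_output_name` and the model path are hypothetical and not part of QAnything; the names `last_hidden_state` and `pooler_output` in the comments are only illustrative of what an export might produce.

```python
from typing import List, Sequence


def list_output_names(model_path: str) -> List[str]:
    """Return the output tensor names declared by an ONNX model."""
    # Imported lazily so pick_output_name can be used without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    return [out.name for out in session.get_outputs()]


def pick_output_name(available: Sequence[str], requested: str = "output") -> str:
    """Return `requested` if the model declares it, else fall back to the first output.

    Assumption: falling back to the first output is only a diagnostic aid. It does
    not guarantee that the tensor has the shape the embedding server expects.
    """
    if not available:
        raise ValueError("the model declares no outputs")
    return requested if requested in available else available[0]


# Hypothetical usage; "bge-large-zh-v1.5.onnx" is a placeholder path:
# names = list_output_names("bge-large-zh-v1.5.onnx")
# print(names)                    # e.g. names such as "last_hidden_state"
# print(pick_output_name(names))  # the name that session.run would accept
```

If the printed names do not include `output`, one option is to re-export the model so that its output carries that name; another is to pass one of the declared names to `session.run`. Which of the two matches what QAnything expects is not confirmed by this thread.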

Environment

- OS: macOS
- NVIDIA Driver:
- CUDA:
- Docker Compose:
- NVIDIA GPU Memory:

QAnything logs

Task exception was never retrieved
future: <Task finished name='Task-3' coro=<EmbeddingAsyncBackend.process_queue() done, defined at /Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py:66> exception=InvalidArgument('[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid output name:output')>
Traceback (most recent call last):
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py", line 82, in process_queue
    result = await loop.run_in_executor(self.executor, self.embed_documents, batch_texts)
  File "/opt/miniconda3/envs/QAnything/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/utils/general_utils.py", line 148, in get_time_inner
    res = func(*arg, **kwargs)
  File "/Users/lionel/Desktop/SourceCode/Other/AI/RAG/QAnything-2.0.0/qanything_kernel/dependent_server/embedding_server/embedding_async_backend.py", line 55, in embed_documents
    outputs_onnx = self.session.run(output_names=['output'], input_feed=inputs_onnx)
  File "/opt/miniconda3/envs/QAnything/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid output name:output

Steps To Reproduce

No response

Anything else?

No response

luckyxue · 2024-12-10T11:57:10Z

RPC error: [batch_insert], <ParamError: (code=1, message=Collection field dim is 768, but entities field dim is 11)>, <Time:{'RPC start': '2024-12-10 10:01:38.125233', 'RPC error': '2024-12-10 10:01:38.125584'}>

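It looks like the dimension of the vectors produced by the embedding step does not match what the collection expects.

Either the Embedding service has crashed, or the problem is on the Milvus side.

Try running test_embed.py on the server.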

print("Best Configurations:")
print(
    f"Best QPS: Batch Size {best_qps['batch_size']}, Threads {best_qps['num_threads']} (QPS: {best_qps['qps']:.2f})")
print(
    f"Best Latency: Batch Size {best_latency['batch_size']}, Threads {best_latency['num_threads']} (Avg Latency: {best_latency['avg_latency'] * 1000:.2f} ms)")
print(
    f"Best Memory Usage: Batch Size {best_memory['batch_size']}, Threads {best_memory['num_threads']} (Max Memory: {best_memory['max_memory_mb']:.2f} MB)")
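The Milvus error above reports a collection field dimension of 768 against an entity dimension of 11. A quick way to narrow down where the mismatch comes from is to check the length of each vector before it is inserted. This is a minimal sketch in plain Python; the helper name `find_bad_dims` and the sample data are hypothetical and not taken from QAnything or test_embed.py.

```python
from typing import List, Sequence


def find_bad_dims(vectors: Sequence[Sequence[float]], expected_dim: int) -> List[int]:
    """Return the indices of vectors whose length differs from expected_dim."""
    return [i for i, vec in enumerate(vectors) if len(vec) != expected_dim]


# Illustrative data: one well-formed vector and one with 11 entries, as in the error.
vectors = [[0.0] * 768, [0.0] * 11]
bad = find_bad_dims(vectors, expected_dim=768)
print(bad)  # [1]
```

If a check like this flags the vectors returned by the Embedding service, the problem is likely upstream of Milvus; if every vector has the expected length, the collection schema is the next place to look.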
