Knowledge base retrieval is very slow — are the embedding and rerank models not using the GPU? #561
Note: preprocess: 0.02s + condense_q_chain: 0.00s + retriever_search: 0.29s + web_search: 0.00s + rerank: 20.81s + reprocess: 0.01s + llm_first_return: 2.89s = first_return: 24.14s + llm_completed: 2.18s + obtain_images_time: 1.09s = chat_completed: 27.41s |
rerank: 20.81s — this model takes the longest by far. |
I'm using DeepSeek's API, together with the OpenAI GPU script. I assumed retrieval would run on the GPU, but it turned out to make no difference at all... |
Check whether your PyTorch install is actually the GPU build.
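A quick way to run the check suggested above: a CPU-only PyTorch wheel reports no CUDA support even on a machine with a GPU, so the models silently fall back to CPU. A minimal sketch, assuming `torch` is installed:

```python
import torch

# CPU-only builds typically report a version string like "2.x.y+cpu";
# GPU builds look like "2.x.y+cu121".
print(torch.__version__)

# False means this PyTorch build cannot use the GPU at all,
# regardless of how the embedding/rerank models are configured.
print(torch.cuda.is_available())
```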
It seems that after replacing the environment's onnxruntime with onnxruntime-gpu and switching both models' startup mode to GPU, they run on the GPU. Can the async backend code for these two models be used directly?
The old version ran on the GPU without problems; 2.0 only runs on the CPU and is painfully slow. I suspect it was deliberately set up this way, which is really annoying.
I tried it: after adding CUDA and cuDNN on top of the original image, using the project's async embedding and rerank code is very fast, and concurrency is decent too. It really does feel deliberate, haha.
I've been looking into this recently as well. Could you share the specific implementation steps and settings?
Hi, could you explain how you did it?
Could you explain how exactly to change the settings?
Knowledge base retrieval is very slow, about 20s per query. Are the embedding and rerank models not using the GPU?