Replies: 2 comments 1 reply
- I think this would likely need refit support. Here is an example in native TensorRT: https://github.com/NVIDIA/TensorRT/tree/release/9.0/demo/Diffusion#generate-an-image-guided-by-a-text-prompt-and-using-specified-lora-model-weight-updates. Some APIs may need to be exposed in torch-trt for this to work out of the box (OOB).
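For illustration, here is a hedged sketch of what refitting an engine with LoRA-merged weights could look like using TensorRT's `trt.Refitter` API. It assumes the engine was built with the REFIT flag and that `merged_state_dict` (a hypothetical name) already maps the engine's refittable weight names to LoRA-merged arrays; mapping PyTorch parameter names onto engine weight names is exactly the part that would need torch-trt API support to work OOB.

```python
# Hedged sketch: refit an existing TensorRT engine with LoRA-merged weights.
# Assumes the engine was built with trt.BuilderFlag.REFIT and that
# merged_state_dict maps refittable engine weight names to numpy arrays
# (base weight + scaled LoRA delta already merged). Names are illustrative.
import numpy as np
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def refit_with_lora(engine: trt.ICudaEngine, merged_state_dict: dict) -> bool:
    refitter = trt.Refitter(engine, TRT_LOGGER)
    # Only weights the engine exposes as refittable can be replaced.
    refittable_names = set(refitter.get_all_weights())
    for name, array in merged_state_dict.items():
        if name in refittable_names:
            refitter.set_named_weights(name, trt.Weights(np.ascontiguousarray(array)))
    # Returns True only if every weight the refitter still needs was supplied.
    return refitter.refit_cuda_engine()
```

Switching adapters between requests would then amount to merging a different adapter into the base weights and refitting again, without rebuilding the engine.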
1 reply
- For future notice, there is now an API that makes LoRAs easier to use with Torch-TRT models: https://github.com/pytorch/TensorRT/blob/main/examples/dynamo/mutable_torchtrt_module_example.py
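Following that linked example, a rough sketch of the workflow with a diffusers pipeline is below; the model ID, LoRA path, and compile settings are assumptions. The key point is that `MutableTorchTensorRTModule` compiles lazily and refits its TensorRT engine when the wrapped module's weights change, so standard LoRA fusing calls just work on the next forward pass.

```python
# Hedged sketch modelled on mutable_torchtrt_module_example.py: wrap the UNet in
# a MutableTorchTensorRTModule, run once, then fuse a LoRA into the same weights;
# the next call detects the weight change and refits the engine.
# Model ID, LoRA path, and compile settings below are assumptions.
import torch
import torch_tensorrt
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Compiles lazily on first forward; later weight changes trigger a refit.
pipe.unet = torch_tensorrt.MutableTorchTensorRTModule(
    pipe.unet, enabled_precisions={torch.float16}
)

image = pipe("a majestic castle in the clouds", num_inference_steps=30).images[0]

# Apply a LoRA adapter in place (standard diffusers LoRA calls); the wrapped
# UNet's weights change, so the next pipeline call refits the TensorRT engine.
pipe.load_lora_weights("path/to/lora.safetensors", adapter_name="style")  # hypothetical path
pipe.fuse_lora()
pipe.unload_lora_weights()

image = pipe("a majestic castle in the clouds", num_inference_steps=30).images[0]
```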
- Hi, I have a base model and several LoRA adapters trained on top of it. The base model is always loaded, and for each inference request I modify it by applying an adapter. I want to optimize my model using TensorRT. Is there a way to apply LoRA adapters to the optimized TensorRT model? I would appreciate any ideas on where to start working on this problem. Thank you.
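For reference, here is a minimal sketch of the eager-mode setup being described, using Hugging Face PEFT (the model name, adapter paths, and `run_request` helper are assumptions): one base model stays resident and the active adapter is switched per request. The question is how to keep this pattern once the model has been compiled with TensorRT.

```python
# Hedged sketch of the eager-mode baseline: one base model kept on the GPU,
# with a different pre-loaded LoRA adapter activated per inference request.
# Model name, adapter paths, and run_request() are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Load every adapter once; requests only switch which one is active.
model = PeftModel.from_pretrained(base, "adapters/task_a", adapter_name="task_a")
model.load_adapter("adapters/task_b", adapter_name="task_b")
model.eval()

def run_request(adapter_name: str, prompt: str) -> torch.Tensor:
    model.set_adapter(adapter_name)  # pick the adapter for this request
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    with torch.no_grad():
        return model(**inputs).logits
```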