Support for Stable Diffusion 3.5 Large #2574
Comments
See #2578
This is working on an A100 but takes too much memory for an RTX 4000 with 20GB. I see there is a quantized GGUF; is it currently possible to use this quantized model? Thanks.
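(For anyone wanting to check what such a file contains: a minimal sketch using candle's `gguf_file` reader to list the quantized tensors. The filename is a placeholder, and this only inspects metadata; actually running SD 3.5 from quantized weights would still need a quantized loader for the MMDiT, which isn't shown here.)

```rust
use candle::quantized::gguf_file;
use std::fs::File;

fn main() -> anyhow::Result<()> {
    // Placeholder filename: substitute the actual quantized checkpoint.
    let mut file = File::open("sd3.5_large-q4_0.gguf")?;
    let content = gguf_file::Content::read(&mut file)?;
    // List each tensor with its shape and GGML quantization type.
    for (name, info) in content.tensor_infos.iter() {
        println!("{name}: shape={:?} dtype={:?}", info.shape, info.ggml_dtype);
    }
    Ok(())
}
```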
I see in the readme of SD3 there is a benchmark running on an RTX 3090 Ti. How much memory does that card have? It seems like 3.5 takes 40+ GB to run in candle...
Great work.
I've pushed some further changes in #2589 so that the f32 conversion is done on the fly rather than upfront, so that we can benefit from the reduced memory usage while retaining full precision. After this, the memory usage I get from nsys during the text-encoding step is down to ~10.5GB. That said, I still see the memory usage getting to ~20GB while running the mmdit, so it's not that likely to fit on a 20GB GPU.
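(Roughly the idea, as a sketch — the helper name is made up, but it shows the pattern of keeping weights in f16 at rest and upcasting to f32 only at the point of use, so the f32 copy is transient:)

```rust
use candle::{DType, Result, Tensor};

// Hypothetical helper illustrating the on-the-fly upcast: the weight
// lives in f16, and a transient f32 copy exists only for the matmul.
fn matmul_upcast(x: &Tensor, w_f16: &Tensor) -> Result<Tensor> {
    let w_f32 = w_f16.to_dtype(DType::F32)?; // transient f32 copy
    let y = x.matmul(&w_f32)?;
    Ok(y)
    // w_f32 drops here, so peak memory includes the f32 weights only briefly
}
```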
Amazing how much space is saved with F16 on the T5. And it's only using 17GB during sampling!
I tried updating the hf repo to 3.5 Large but it's not working.