Memory leak #13
Comments
This assumes that the dataset will always fit in memory.
@aronhoff's solution works like a charm! I had been struggling with this.
Well, I've put it there for a reason, and it still ensures that you don't iterate through the entire dataset every time in the common case where the dataset isn't larger than memory. I think parameterising it is the best solution for now.
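As an illustration of what "parameterising" the cache could look like, here is a hypothetical sketch (not the library's actual API): a wrapper dataset that takes a cache flag, so callers with datasets too large for memory can opt out of memoization entirely. The class name CachingDataset is made up for this example.

```python
# Hypothetical sketch of an opt-in sample cache, not the library's actual implementation.
import torch.utils.data


class CachingDataset(torch.utils.data.Dataset):
    def __init__(self, dataset, cache=True):
        self.dataset = dataset
        self.cache = {} if cache else None  # None disables memoization entirely

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        if self.cache is None:
            return self.dataset[idx]
        if idx not in self.cache:
            self.cache[idx] = self.dataset[idx]
        return self.cache[idx]


# Usage: CachingDataset(huge_dataset, cache=False) keeps memory usage flat,
# at the cost of recomputing samples on every access.
```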
I tried adding a bool attribute to the … Unfortunately, this does not work with multiprocessing. I do not currently have more time to find a way.

A solution could be in having a custom … Perhaps concerns should be separated completely. You could have a …

Keep in mind that the cache …

And with regards to the dataset, ImageNet would optimistically take around 200 GB :)
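One way to keep the cache from growing without bound, so that even an ImageNet-sized dataset only ever holds a fixed number of samples in memory, is to cap it with an LRU eviction policy. This is my own sketch of that idea, not something proposed verbatim in the thread; LRUCachedDataset and max_cached are names invented for the example.

```python
# Sketch: bound memory usage by capping the sample cache with an LRU policy.
from collections import OrderedDict

import torch.utils.data


class LRUCachedDataset(torch.utils.data.Dataset):
    def __init__(self, dataset, max_cached=1024):
        self.dataset = dataset
        self.max_cached = max_cached
        self._cache = OrderedDict()

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        if idx in self._cache:
            self._cache.move_to_end(idx)      # mark as most recently used
            return self._cache[idx]
        sample = self.dataset[idx]
        self._cache[idx] = sample
        if len(self._cache) > self.max_cached:
            self._cache.popitem(last=False)   # evict the least recently used entry
        return sample
```

Note that, as pointed out above, a plain per-object cache like this is not shared across DataLoader worker processes; each worker would keep its own (bounded) copy.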
Thanks for the PR, @shrubb! I have been quite busy and haven't had time to look at it. I'll take a look soon and get back to you.
Has this been resolved? :)
Any solution? I've just run into this memory leak as well!
Another issue with memoizing the …

For complicated training cases, executing …
Facing the same issue. Any workarounds?
Hi,
when the below example is run, the RAM usage grows forever:
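The code example itself did not survive in this copy of the issue. As a rough stand-in, here is a minimal sketch of the kind of reproduction the notes below describe, assuming the library in question is nonechucks (whose SafeDataset is referenced in the notes) and that the dataset simply returns fresh torch.empty(10_000) tensors. DummyDataset and the loop are placeholders, not the reporter's original code.

```python
# Hypothetical reconstruction of the reproduction case, not the reporter's exact code.
import torch
import torch.utils.data
import nonechucks as nc  # assumption: the library being discussed


class DummyDataset(torch.utils.data.Dataset):
    def __len__(self):
        return 666  # arbitrary length, echoing the note below

    def __getitem__(self, idx):
        # Each call allocates a fresh tensor; nothing here should accumulate.
        return torch.empty(10_000)


dataset = nc.SafeDataset(DummyDataset())

# Iterate indefinitely and watch the process RSS grow
# (kill the process before you run out of memory!).
i = 0
while True:
    _ = dataset[i % len(dataset)]
    i += 1
```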
Notes:

- 666 …
- … with torch.empty(10_000) (be careful to kill the process in time, before you're OOM!)
- … SafeDataset …
- With torch.utils.data.DataLoader, the leak is still there, although at a smaller scale: around 1 MB of RAM is lost per 30000-40000 __getitem__ calls.