Challenges in Memorizing Single or Few Images #49
I have never run an experiment to memorize data. A few things to check before drawing any conclusion:
Hi Tianhong, thank you for your prompt response. I am currently studying the generalization properties of image/video generative methods and am particularly curious about how these models scale with the size of the effective training data. To investigate this with MAR, I ran an experiment to explore whether methods like MAR, MaskGIT, or others can successfully memorize and regenerate a single image. For this purpose, I replicated the image to fill a batch size of 512, increased the diffusion batch multiplier, and turned off the EMA. As a result, I achieved a near-zero loss (~0.009), but it takes a long time for the FID to start decreasing from its maximum (~600). Do you have any insights into why this behaviour occurs? I plan to run more experiments to better understand these models and their generalization capabilities, and will keep you updated with any further findings.
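The single-image replication described above can be sketched roughly as follows. This is a toy illustration, not the actual MAR training code: `SingleImageDataset`, the image shape, and the dummy label are my own assumptions, but the idea (a virtual dataset of length 512 that always returns the same image) matches the setup in the comment.

```python
import numpy as np

class SingleImageDataset:
    """Repeat one image so that every batch of 512 contains only copies of it.

    Follows the PyTorch Dataset __len__/__getitem__ protocol, so it could be
    wrapped in a DataLoader; here we stack a batch by hand to stay numpy-only.
    """

    def __init__(self, image, length=512):
        self.image = image      # e.g. a (3, 256, 256) array (shape assumed)
        self.length = length    # virtual dataset size = one full batch

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        # Every index returns the same image (plus a dummy class label 0),
        # so a batch_size=512 loader sees 512 identical samples.
        return self.image, 0

# Usage: build one "batch" by stacking all 512 copies.
image = np.zeros((3, 256, 256), dtype=np.float32)
ds = SingleImageDataset(image, length=512)
batch = np.stack([ds[i][0] for i in range(len(ds))])
print(batch.shape)  # (512, 3, 256, 256)
```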
If you only have 1 data point, the FID will definitely be very high. Have you visualized your generated images to see whether they match the single image you provided? |
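The point about FID being inherently high for a single memorized image can be made concrete: the Fréchet distance between two Gaussians includes a covariance term, and a generated set consisting of one repeated image has (near-)zero covariance, so that term stays large no matter how faithfully the image is reproduced. Below is a toy sketch using diagonal covariances; the real FID uses 2048-dimensional Inception features and full covariance matrices, so the numbers here are only illustrative.

```python
import numpy as np

def frechet_diag(mu1, var1, mu2, var2):
    """Frechet distance between Gaussians with diagonal covariances:
    ||mu1 - mu2||^2 + sum(var1 + var2 - 2*sqrt(var1*var2))."""
    return np.sum((mu1 - mu2) ** 2) + np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))

rng = np.random.default_rng(0)
d = 16                             # toy feature dimension (real FID uses 2048)
ref = rng.normal(size=(1000, d))   # "reference" feature set with real spread

# Generated set = one image repeated: its mean equals one real sample,
# but its per-dimension variance is exactly zero.
single = np.tile(ref[0], (1000, 1))

fid_single = frechet_diag(ref.mean(0), ref.var(0), single.mean(0), single.var(0))
fid_self = frechet_diag(ref.mean(0), ref.var(0), ref.mean(0), ref.var(0))
print(fid_self, fid_single)  # ~0 vs. a large value driven by the variance gap
```

Even a perfect memorizer cannot push `fid_single` toward zero, because the zero-variance term is a property of the sample set, not of image quality.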
Recently, I attempted to train the model on a different domain and decided to start with a simple experiment: testing whether the model could memorize a single image and then sample it correctly.
Here’s what I did:
- Set `diffusion_batch_mul` to 256, to make it easier to assign the data points to the appropriate outputs.

Despite these adjustments, the model didn’t work as expected. Initially, I thought the issue might be related to the new domain, suspecting that the VAE might not be tokenizing the data properly. However, I encountered the same issue when trying to memorize two images from the ImageNet dataset.
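For context on what raising `diffusion_batch_mul` does: as I understand the MAR code, it repeats each token's conditioning vector and target along the batch axis so that the per-token diffusion loss is averaged over many independent noise draws in one step, which reduces gradient noise when the effective dataset is tiny. A rough sketch of that repetition; the shapes and the function name `expand_for_diffusion` are illustrative, not the exact MAR implementation:

```python
import numpy as np

def expand_for_diffusion(z, target, diffusion_batch_mul=256):
    """Repeat conditioning vectors and targets along the batch axis so each
    token is paired with `diffusion_batch_mul` independent noise samples."""
    z_rep = np.repeat(z, diffusion_batch_mul, axis=0)
    target_rep = np.repeat(target, diffusion_batch_mul, axis=0)
    # One fresh noise sample per copy (the real loss also draws a timestep).
    noise = np.random.default_rng(0).normal(size=target_rep.shape)
    return z_rep, target_rep, noise

z = np.zeros((8, 768))      # 8 tokens' conditioning vectors (toy sizes)
target = np.zeros((8, 16))  # 8 token latents to denoise
z_rep, t_rep, noise = expand_for_diffusion(z, target)
print(z_rep.shape, t_rep.shape)  # (2048, 768) (2048, 16)
```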
Do you have any intuition or thoughts on why this might be happening?