About Training Loss #37
500 epochs on CIFAR-10 might not be enough. Since the EMA decay is 0.9999, the EMA model needs around 100k iterations to generate reasonable images, and the model without EMA still needs around 50k iterations. Also, please check:
1. whether you have normalized the tokens according to your new autoencoder (the current normalization constant, 0.2325, is specific to our ImageNet tokenizer);
2. whether using 1000 diffusion steps instead of 100 helps, to see whether the large values come from the diffusion process.
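As an aside, here is a minimal sketch of how one might estimate a replacement for that normalization constant from a custom autoencoder's latents. `my_ae` and `train_loader` are placeholders, and whether the constant is applied as a multiplier or divisor should be checked against the codebase you are training with.

```python
import torch
from torch.utils.data import DataLoader

@torch.no_grad()
def compute_latent_scale(encode_fn, loader: DataLoader, device: str = "cuda") -> float:
    """Estimate the standard deviation of an autoencoder's latents over a dataset."""
    chunks = []
    for images, _ in loader:
        z = encode_fn(images.to(device))      # assumed to return the latent tensor
        chunks.append(z.flatten().cpu())
    std = torch.cat(chunks).std().item()
    print(f"latent std = {std:.4f}")
    return std

# Usage (placeholders: `my_ae` is your self-trained autoencoder,
# `train_loader` a CIFAR-10 DataLoader):
#   std = compute_latent_scale(my_ae.encode, train_loader)
#   scale = 1.0 / std   # candidate replacement for 0.2325, assuming the tokens are
#                       # multiplied by the constant so they end up near unit variance
```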
Does that mean I need to train MAR on CIFAR-10 for at least 100k epochs?? 😱 (EMA = 0.9999)
No -- 100k iterations (160 epochs on ImageNet with bsz=2048). For CIFAR-10 it should be around 2000 epochs.
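For reference, the iterations-to-epochs conversion only depends on the dataset size and batch size. A quick back-of-the-envelope check (the CIFAR-10 batch size here is an assumption; with 1024 it lands near the ~2000-epoch figure above):

```python
# Back-of-the-envelope iterations <-> epochs conversion for CIFAR-10.
dataset_size = 50_000          # CIFAR-10 training images
batch_size = 1024              # assumed; plug in your actual training batch size
target_iterations = 100_000    # figure suggested above for the EMA model

iters_per_epoch = dataset_size / batch_size           # ~49 at bsz=1024
epochs_needed = target_iterations / iters_per_epoch   # ~2050 at bsz=1024
print(f"~{epochs_needed:.0f} epochs at batch size {batch_size}")
```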
OK, thank you very much.
I am using my self-trained autoencoder as the tokenizer to train MAR on the CIFAR-10 dataset. After 500 epochs the loss dropped to around 0.1, but the generated images are almost all white, with very high pixel values. I noticed that the sample_tokens passed to the AE decoder after sampling have very large values, with a mean of over 1000, while the mean of the original latents is only about 2. I'm not sure why this is happening and would greatly appreciate your help in resolving this issue.
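As a debugging aid (not from the thread), one could compare the statistics of the encoded latents against the sampled tokens right before decoding; `ae`, `model`, and `images` below are placeholders for the autoencoder, the trained MAR model, and a batch of CIFAR-10 images.

```python
import torch

def latent_stats(name: str, z: torch.Tensor) -> None:
    """Print basic statistics of a latent/token tensor."""
    print(f"{name}: mean={z.mean().item():.3f}, std={z.std().item():.3f}, "
          f"min={z.min().item():.3f}, max={z.max().item():.3f}")

# Usage (placeholders: `ae`, `model`, `images`):
#   with torch.no_grad():
#       latent_stats("encoded latents", ae.encode(images))
#       latent_stats("sampled tokens", model.sample_tokens(bsz=images.size(0)))
#
# A mean of ~1000 on sampled tokens versus ~2 on encoded latents points to a
# scale mismatch (e.g. the ImageNet-specific 0.2325 constant being applied to
# latents with a very different std), which would explain the washed-out,
# near-white decoded images.
```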