I notice there is a `discretized_gaussian_log_likelihood` function that estimates the log-likelihood of the representation reconstructed from $x_1$. Since the VAE has already encoded the images into a continuous latent space, I am confused about why we need this function, which estimates the log-likelihood of a Gaussian distribution discretized against an image ground truth. Why not directly use the MSE loss (i.e., $\|x_0 - x^{\text{reconstruct}}_0\|^2$) to optimize the log-likelihood of the reconstructed latent representation?
Looking forward to your reply. Thanks in advance!
Thanks for your interest! This VLB loss exactly follows the iDDPM and DiT design. However, we also conducted experiments without the VLB loss (reconstruction loss only), and the performance is the same.
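For readers comparing the two losses, here is a minimal scalar sketch of a discretized Gaussian log-likelihood next to a plain squared error. It is not this repository's implementation: it assumes values rescaled from 8-bit data to $[-1, 1]$ (bin half-width $1/255$), uses the exact normal CDF via `math.erf`, and the helper names `standard_normal_cdf` and `squared_error` are hypothetical.

```python
import math


def standard_normal_cdf(z):
    # Exact CDF of N(0, 1), written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def discretized_gaussian_log_likelihood(x, mean, log_scale,
                                        bin_half_width=1.0 / 255.0):
    """Log-probability that N(mean, exp(log_scale)^2) lands in the bin around x.

    Assumption: x lies in [-1, 1] and was rescaled from 8-bit values, so each
    bin spans x +/- 1/255. The two edge bins absorb the Gaussian tails.
    """
    inv_std = math.exp(-log_scale)
    cdf_plus = standard_normal_cdf((x - mean + bin_half_width) * inv_std)
    cdf_minus = standard_normal_cdf((x - mean - bin_half_width) * inv_std)
    if x < -0.999:
        prob = cdf_plus            # lowest bin: all mass below its upper edge
    elif x > 0.999:
        prob = 1.0 - cdf_minus     # highest bin: all mass above its lower edge
    else:
        prob = cdf_plus - cdf_minus
    # Clamp before the log so a vanishing probability does not raise an error.
    return math.log(max(prob, 1e-12))


def squared_error(x, mean):
    # Per-element term of the MSE loss mentioned in the question.
    return (x - mean) ** 2


ll_close = discretized_gaussian_log_likelihood(0.2, 0.2, math.log(0.1))
ll_far = discretized_gaussian_log_likelihood(0.2, 0.5, math.log(0.1))
print(ll_close, ll_far, squared_error(0.2, 0.5))
```

When the bin is narrow relative to the standard deviation, the interior-bin probability is approximately the Gaussian density times the bin width, so for a fixed variance the log-likelihood differs from a scaled negative squared error by roughly a constant. Under that assumption the two objectives rank predictions the same way, which is at least consistent with the observation above that dropping the VLB term did not change performance.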