Reconstruction loss in ELBO #51

Paulmzr · 2024-09-19T09:18:37Z

Hi, thanks for your great work!

I noticed there is a `discretized_gaussian_log_likelihood` function that estimates the log-likelihood of the representation reconstructed from $x_1$. Since the VAE has already encoded the images into a continuous latent space, I am confused about why we need this function, which evaluates the log-likelihood of a Gaussian discretized to match image ground truth. Why not directly use the MSE loss (i.e., $\|x_0 - x^{reconstruct}_0\|^2$) to optimize the log-likelihood of the reconstructed latent representation?

Looking forward to your reply. Thanks in advance!


LTH14 · 2024-09-19T13:28:48Z

Thanks for your interest! This VLB loss exactly follows the iDDPM and DiT design. However, we also conducted experiments without the VLB loss (reconstruction loss only), and the performance is the same.
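For readers comparing the two options, here is a minimal scalar sketch of why they tend to behave alike. It is not the repository's code: the helper names `gaussian_log_pdf` and `discretized_gaussian_log_prob` are hypothetical, the bin half-width of 1/255 assumes 8-bit values scaled to [-1, 1], the exact CDF via `math.erf` is used for simplicity, and any special handling of the edge bins is left out. Under those assumptions, when the standard deviation is much larger than the bin width, the discretized log-probability is roughly the continuous Gaussian log-density plus a constant, and with a fixed variance the Gaussian log-density is a squared error up to scale and offset.

```python
import math

# Assumption: 8-bit values scaled to [-1, 1], so one bin spans 2/255.
BIN_HALF_WIDTH = 1.0 / 255.0


def normal_cdf(z):
    """Exact standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gaussian_log_pdf(x, mean, log_scale):
    """Log-density of a continuous Gaussian with std = exp(log_scale)."""
    z = (x - mean) * math.exp(-log_scale)
    return -0.5 * z * z - log_scale - 0.5 * math.log(2.0 * math.pi)


def discretized_gaussian_log_prob(x, mean, log_scale):
    """Log of the Gaussian mass in the bin centered at x (hypothetical helper).

    Edge bins are not treated specially in this sketch.
    """
    inv_std = math.exp(-log_scale)
    upper = normal_cdf(inv_std * (x - mean + BIN_HALF_WIDTH))
    lower = normal_cdf(inv_std * (x - mean - BIN_HALF_WIDTH))
    return math.log(max(upper - lower, 1e-12))  # clamp to avoid log(0)


x, log_scale = 0.1, math.log(0.5)
for mean in (0.3, 0.6):
    discrete = discretized_gaussian_log_prob(x, mean, log_scale)
    continuous = gaussian_log_pdf(x, mean, log_scale)
    # The gap stays close to log(2 / 255), the log of the bin width.
    print(mean, discrete - continuous, math.log(2.0 * BIN_HALF_WIDTH))
```

In this regime the two objectives differ only by a constant and a scale that depends on the variance, which is consistent with the observation above that dropping the VLB term leaves performance unchanged. The approximation degrades when the standard deviation becomes comparable to the bin width.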

Paulmzr · 2024-09-20T07:23:47Z

@LTH14 thanks for your response!
