This repository has been archived by the owner on May 21, 2022. It is now read-only.

Adam optimizer #13

Open
CorySimon opened this issue Mar 21, 2017 · 1 comment

Comments

@CorySimon

This package is really useful as a collection of learning-rate updaters; I'm using a variant of the Adam scheme from here for SGD.

I think it is unnecessary to store the ρᵢᵗ terms as vectors. Shouldn't these be Float64s?
Also, a pedantic point: I'm not sure why they are called ρ instead of β, as in the paper.
https://github.com/JuliaML/StochasticOptimization.jl/blob/master/src/paramupdaters.jl#L123-L124
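
To illustrate what I mean (just a sketch, not the package's actual code; AdamState and update! are made-up names), the bias-correction factors can be kept as plain Float64 fields while only m and v are per-parameter vectors:

# Sketch of an Adam updater with scalar bias-correction accumulators.
mutable struct AdamState
    α::Float64          # step size
    β₁::Float64         # first-moment decay
    β₂::Float64         # second-moment decay
    ϵ::Float64
    β₁ᵗ::Float64        # running product β₁^t -- a plain scalar
    β₂ᵗ::Float64        # running product β₂^t -- a plain scalar
    m::Vector{Float64}  # per-parameter first-moment estimates
    v::Vector{Float64}  # per-parameter second-moment estimates
end

AdamState(n::Int; α=0.001, β₁=0.9, β₂=0.999, ϵ=1e-8) =
    AdamState(α, β₁, β₂, ϵ, 1.0, 1.0, zeros(n), zeros(n))

function update!(s::AdamState, θ::Vector{Float64}, ∇::Vector{Float64})
    s.β₁ᵗ *= s.β₁
    s.β₂ᵗ *= s.β₂
    for i in eachindex(θ)
        s.m[i] = s.β₁ * s.m[i] + (1 - s.β₁) * ∇[i]     # first moment
        s.v[i] = s.β₂ * s.v[i] + (1 - s.β₂) * ∇[i]^2   # second moment
        m̂ = s.m[i] / (1 - s.β₁ᵗ)                       # bias correction
        v̂ = s.v[i] / (1 - s.β₂ᵗ)
        θ[i] -= s.α * m̂ / (sqrt(v̂) + s.ϵ)
    end
    return θ
end

Usage would be something like s = AdamState(length(θ)) followed by update!(s, θ, ∇) each iteration.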

@CorySimon
Author

Also, comparing with the paper (https://arxiv.org/pdf/1412.6980.pdf), the θ update in the code does not match the Adam update there.
Shouldn't it be:

θ[i] -= α * m[i] / (1.0 - β₁ᵗ) * sqrt(1.0 - β₂ᵗ) / (sqrt(v[i]) + ϵ * sqrt(1.0 - β₂ᵗ))
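
For what it's worth, here is a quick standalone check (illustrative numbers, not code from the package) that this rearranged form is algebraically the same as the paper's update θ ← θ − α·m̂/(√v̂ + ϵ), with m̂ = m/(1 − β₁ᵗ) and v̂ = v/(1 − β₂ᵗ):

# Compare the paper's bias-corrected step with the rearranged one-line form.
α, ϵ = 0.001, 1e-8
β₁ᵗ, β₂ᵗ = 0.9^3, 0.999^3    # e.g. after t = 3 steps
m, v = 0.05, 0.002           # example moment estimates for one coordinate

paper_form = α * (m / (1 - β₁ᵗ)) / (sqrt(v / (1 - β₂ᵗ)) + ϵ)
rearranged = α * m / (1.0 - β₁ᵗ) * sqrt(1.0 - β₂ᵗ) / (sqrt(v) + ϵ * sqrt(1.0 - β₂ᵗ))

@assert isapprox(paper_form, rearranged; rtol = 1e-12)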

Please confirm that I am correct, and I will make a pull request. Thanks.
