Improve Adadelta doc.

PiperOrigin-RevId: 236469934
Zhenyu Tan 2019-03-02 10:36:31 -08:00 committed by TensorFlower Gardener
parent 2a991bbccc
commit 7d58fd5675

@@ -41,13 +41,14 @@ class Adadelta(optimizer_v2.OptimizerV2):
  Initialization:
- $$accum_g_0 := 0 \text{(Initialize gradient 2nd order moment vector)}$$
- $$accum_x_0 := 0 \text{(Initialize variable update 2nd order moment vector)}$$
+ $$E[g^2]_0 := 0 \text{(Initialize gradient 2nd order moment vector)}$$
+ $$E[\Delta x^2]_0 := 0 \text{(Initialize 2nd order variable update)}$$
  $$t := t + 1$$
- $$accum_g_t := rho * accum_g_{t-1} + (1 - rho) * g * g$$
- $$delta = -\sqrt{accum_x_{t-1}} / (\sqrt{accum_g_{t-1}} + \epsilon)$$
- $$accum_x_t := rho * accum_x_{t-1} + (1 - rho) * delta * delta$$
+ $$E[g^2]_t := \rho * E[g^2]_{t-1} + (1 - \rho) * g^2$$
+ $$\Delta x_t = -RMS[\Delta x]_{t-1} * g_t / RMS[g]_t$$
+ $$E[\Delta x^2]_t := \rho * E[\Delta x^2]_{t-1} + (1 - \rho) * \Delta x_t^2$$
+ $$x_t := x_{t-1} + \Delta x_{t}$$
  References
  See [M. D. Zeiler](http://arxiv.org/abs/1212.5701)
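
For readers who want to trace the revised formulas by hand, here is a minimal scalar sketch in plain Python. It is not the TensorFlow implementation. The names `adadelta_step` and `minimize_quadratic` are hypothetical, the defaults `rho=0.95` and `epsilon=1e-6` are illustrative assumptions, and `RMS[y]` is assumed to mean `sqrt(E[y^2] + epsilon)` as in Zeiler's paper, since the docstring does not define it.

```python
import math


def adadelta_step(x, g, avg_sq_grad, avg_sq_delta, rho=0.95, epsilon=1e-6):
    """One scalar Adadelta update (illustrative sketch, not TensorFlow code).

    avg_sq_grad  plays the role of E[g^2]
    avg_sq_delta plays the role of E[\\Delta x^2]
    """
    # E[g^2]_t := rho * E[g^2]_{t-1} + (1 - rho) * g^2
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * g * g

    # Assumed definition: RMS[y] = sqrt(E[y^2] + epsilon)
    rms_delta_prev = math.sqrt(avg_sq_delta + epsilon)  # RMS[\Delta x]_{t-1}
    rms_grad = math.sqrt(avg_sq_grad + epsilon)         # RMS[g]_t

    # \Delta x_t = -RMS[\Delta x]_{t-1} * g_t / RMS[g]_t
    delta = -rms_delta_prev * g / rms_grad

    # E[\Delta x^2]_t := rho * E[\Delta x^2]_{t-1} + (1 - rho) * \Delta x_t^2
    avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta * delta

    # x_t := x_{t-1} + \Delta x_t
    return x + delta, avg_sq_grad, avg_sq_delta


def minimize_quadratic(x0=1.0, steps=50):
    """Run the update on f(x) = x^2, whose gradient is 2 * x."""
    x, avg_sq_grad, avg_sq_delta = x0, 0.0, 0.0  # both accumulators start at 0
    for _ in range(steps):
        g = 2.0 * x
        x, avg_sq_grad, avg_sq_delta = adadelta_step(
            x, g, avg_sq_grad, avg_sq_delta)
    return x
```

Calling `minimize_quadratic()` moves `x` from 1.0 toward the minimum at 0. No learning rate appears in the sketch because the step size comes from the ratio of the two running averages, which is the relationship the new notation makes explicit.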