Improve Adadelta doc.
PiperOrigin-RevId: 236469934
parent 2a991bbccc
commit 7d58fd5675
@@ -41,13 +41,14 @@ class Adadelta(optimizer_v2.OptimizerV2):
 
 Initialization:
 
-$$accum_g_0 := 0 \text{(Initialize gradient 2nd order moment vector)}$$
-$$accum_x_0 := 0 \text{(Initialize variable update 2nd order moment vector)}$$
+$$E[g^2]_0 := 0 \text{(Initialize gradient 2nd order moment vector)}$$
+$$E[\Delta x^2]_0 := 0 \text{(Initialize 2nd order variable update)}$$
 
 $$t := t + 1$$
-$$accum_g_t := rho * accum_g_{t-1} + (1 - rho) * g * g$$
-$$delta = -\sqrt{accum_x_{t-1}} / (\sqrt{accum_g_{t-1}} + \epsilon)$$
-$$accum_x_t := rho * accum_x_{t-1} + (1 - rho) * delta * delta$$
+$$E[g^2]_t := \rho * E[g^2]_{t-1} + (1 - \rho) * g^2$$
+$$\Delta x_t = -RMS[\Delta x]_{t-1} * g_t / RMS[g]_t$$
+$$E[\Delta x^2]_t := \rho * E[\Delta x^2]_{t-1} + (1 - \rho) * \Delta x_t^2$$
+$$x_t := x_{t-1} + \Delta x_{t}$$
 
 References
 See [M. D. Zeiler](http://arxiv.org/abs/1212.5701)
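The updated equations can be read as a short update loop. The following is a minimal pure-Python sketch for a single scalar parameter, not the TensorFlow implementation. It assumes the definition RMS[y]_t = sqrt(E[y^2]_t + epsilon) from the referenced Zeiler paper, which the docstring itself does not spell out. The function names `adadelta_step` and `run_steps`, the default values `rho=0.95` and `epsilon=1e-6`, and the toy objective f(x) = x^2 are illustrative assumptions.

```python
import math


def adadelta_step(x, grad, avg_sq_grad, avg_sq_delta, rho=0.95, epsilon=1e-6):
    """Apply one Adadelta update to a scalar parameter x.

    avg_sq_grad  plays the role of E[g^2]  (running average of squared gradients).
    avg_sq_delta plays the role of E[dx^2] (running average of squared updates).
    Assumption: RMS[y] = sqrt(E[y^2] + epsilon), as in the Zeiler paper.
    """
    # E[g^2]_t := rho * E[g^2]_{t-1} + (1 - rho) * g^2
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad * grad

    # dx_t = -RMS[dx]_{t-1} * g_t / RMS[g]_t
    rms_delta_prev = math.sqrt(avg_sq_delta + epsilon)
    rms_grad = math.sqrt(avg_sq_grad + epsilon)
    delta = -rms_delta_prev * grad / rms_grad

    # E[dx^2]_t := rho * E[dx^2]_{t-1} + (1 - rho) * dx_t^2
    avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta * delta

    # x_t := x_{t-1} + dx_t
    return x + delta, avg_sq_grad, avg_sq_delta


def run_steps(x, steps):
    """Run a few updates on the toy objective f(x) = x^2 (gradient 2x)."""
    avg_sq_grad, avg_sq_delta = 0.0, 0.0  # both accumulators start at zero
    for _ in range(steps):
        grad = 2.0 * x
        x, avg_sq_grad, avg_sq_delta = adadelta_step(
            x, grad, avg_sq_grad, avg_sq_delta)
    return x


if __name__ == "__main__":
    print(run_steps(1.0, 10))
```

Both accumulators start at zero, so the epsilon term inside RMS[dx] is what makes the very first update non-zero. The old formula in the removed lines read the gradient accumulator at t-1 and omitted the gradient factor, which is why the new notation follows the paper more closely.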