Model Notes Revised - 2017/03/15

Learning:

$$
H^{word}_{enc,i,j} = \begin{cases}
f^{GRU}_{\Theta}(x_{i,j}) & i \in [1,n],\; j=1 \\
f^{GRU}_{\Theta}(x_{i,j},\, H^{word}_{enc,i,j-1}) & i \in [1,n],\; j \in [2,m]
\end{cases}
$$

$$
H^{con}_{enc,i} = \begin{cases}
f^{GRU}_{\Phi}(H^{word}_{enc,i,-1}) & i=1 \\
f^{GRU}_{\Phi}(H^{word}_{enc,i,-1},\, H^{con}_{enc,i-1}) & i \in [2,n]
\end{cases}
$$

$$
H^{enc2lat}_{i} = \begin{cases}
\begin{cases}
f^{rnn}_{\vec z_1}(\text{init}) & i' = n \\
f^{rnn}_{\vec z_1}(H^{con}_{enc,i'+1},\, H^{con\prime}_{enc,i'+1}) & i' \in [n-1,1]
\end{cases} & i'=i=1 \\
f^{rnn}_{\vec z_i}(H^{con}_{enc,i},\, \vec z_{i-1}) \;\text{ or }\; \text{concat}/{\textstyle\sum}(H^{con}_{enc,i},\, \vec z_{i-1}) & i \in [2,n]
\end{cases}
$$

$$
\vec z_i \sim \mathcal{N}\big(\,\cdot \mid f^{mlp}_{\mu}(H^{enc2lat}_{i}),\, f^{mlp}_{\sigma}(H^{enc2lat}_{i})\big)
$$

$$
H^{con}_{dec,i} = f_{lat2dec}(\vec z_i)
$$

$$
H^{word}_{dec,i,j} = \begin{cases}
f^{GRU}_{\Omega}(H^{con}_{dec,i}) & i \in [1,n],\; j=1 \\
f^{GRU}_{\Omega}(H^{con}_{dec,i},\, \hat x_{i,j},\, H^{word}_{dec,i,j-1}) & i \in [1,n],\; j \in [2,m]
\end{cases}
$$
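A minimal PyTorch sketch of this learning-time pass, under stated assumptions: the module names (`word_enc`, `con_enc`, `bwd_rnn`, `enc2lat`, `lat2dec`, `word_dec`), the sizes, and the choice of the concat variant of $f^{rnn}_{\vec z_i}$ for $i \ge 2$ are illustrative, not an actual implementation; $f^{mlp}_{\sigma}$ is treated here as predicting a log-variance.

```python
import torch
import torch.nn as nn


class HierEncoderDecoder(nn.Module):
    """Word-level encoder (Theta), context-level encoder (Phi), latent layer
    (mu/sigma MLPs + reparameterisation) and word-level decoder (Omega)."""

    def __init__(self, vocab, emb=128, hid=256, lat=64):
        super().__init__()
        self.embed    = nn.Embedding(vocab, emb)
        self.word_enc = nn.GRU(emb, hid, batch_first=True)       # f^GRU_Theta
        self.con_enc  = nn.GRUCell(hid, hid)                      # f^GRU_Phi
        self.bwd_rnn  = nn.GRUCell(hid, hid)                      # f^rnn_{z_1}: backward pass over context states
        self.enc2lat  = nn.Linear(hid + lat, hid)                 # concat variant of f^rnn_{z_i}, i >= 2
        self.mu       = nn.Linear(hid, lat)                       # f^mlp_mu
        self.logvar   = nn.Linear(hid, lat)                       # f^mlp_sigma (as log-variance)
        self.lat2dec  = nn.Linear(lat, hid)                       # f_lat2dec
        self.word_dec = nn.GRU(emb, hid, batch_first=True)        # f^GRU_Omega
        self.out      = nn.Linear(hid, vocab)

    def forward(self, x):                                         # x: (batch, n, m) token ids
        b, n, m = x.shape
        # word-level then context-level encoding: H^word_{enc,i,-1} -> H^con_{enc,i}
        h_con, cons = torch.zeros(b, self.con_enc.hidden_size, device=x.device), []
        for i in range(n):
            _, h_word = self.word_enc(self.embed(x[:, i]))
            h_con = self.con_enc(h_word.squeeze(0), h_con)
            cons.append(h_con)
        # backward RNN over the context states gives H^{enc2lat}_1
        h_lat = torch.zeros_like(h_con)                           # "init"
        for i in reversed(range(n)):
            h_lat = self.bwd_rnn(cons[i], h_lat)
        logits, mus, logvars, z = [], [], [], None
        for i in range(n):
            if i > 0:                                             # i in [2, n]: condition on z_{i-1}
                h_lat = torch.tanh(self.enc2lat(torch.cat([cons[i], z], dim=-1)))
            mu, logvar = self.mu(h_lat), self.logvar(h_lat)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterised z_i
            # teacher-forced word-level decoding conditioned on H^con_{dec,i} = f_lat2dec(z_i)
            dec_out, _ = self.word_dec(self.embed(x[:, i]), self.lat2dec(z).unsqueeze(0))
            logits.append(self.out(dec_out))
            mus.append(mu)
            logvars.append(logvar)
        return torch.stack(logits, 1), torch.stack(mus, 1), torch.stack(logvars, 1)
```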

Reconstruct:

$$
H^{enc2lat}_{i} = f^{rnn}_{\vec z_i}(H^{con}_{dec,i-1},\, \vec z_{i-1}) \;\text{ or }\; \text{concat}/{\textstyle\sum}(H^{con}_{dec,i-1},\, \vec z_{i-1}) \qquad i \in [2,n]
$$

$$
\vec{\mu}_{1,empirical} = \operatorname{avg}(\vec{\mu}_{1,training}), \qquad \vec{\sigma}_{1,empirical} = \operatorname{avg}(\vec{\sigma}_{1,training})
$$

$$
\vec{z}_i \sim \begin{cases}
\mathcal{N}(\vec{\mu}_{1,empirical},\, \vec{\sigma}_{1,empirical}) & i=1 \\
\mathcal{N}\big(f^{mlp}_{\mu}(H^{enc2lat}_{i}),\, f^{mlp}_{\sigma}(H^{enc2lat}_{i})\big) & i \in [2,n]
\end{cases}
$$

$$
H^{con}_{dec,i} = f_{lat2dec}(\vec z_i)
$$

$$
H^{word}_{dec,i,j} = \begin{cases}
f^{GRU}_{\Omega}(H^{con}_{dec,i}) & i \in [1,n],\; j=1 \\
f^{GRU}_{\Omega}(H^{con}_{dec,i},\, \hat x_{i,j},\, H^{word}_{dec,i,j-1}) & i \in [1,n],\; j \in [2,m]
\end{cases}
$$
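A matching sketch of this reconstruction (generation) pass, assuming the same hypothetical module names as above and greedy word-level decoding; `mu1_emp` / `sigma1_emp` stand for the training-set averages $\vec{\mu}_{1,empirical}$ and $\vec{\sigma}_{1,empirical}$, with shape `(batch, lat)`.

```python
import torch


@torch.no_grad()
def reconstruct(model, mu1_emp, sigma1_emp, n_sentences, max_words, bos_id):
    """z_1 ~ N(mu_1_empirical, sigma_1_empirical); later z_i conditioned on
    H^con_{dec,i-1} and z_{i-1}; greedy word-level decoding."""
    z = mu1_emp + torch.randn_like(mu1_emp) * sigma1_emp          # z_1 from the empirical prior
    b = z.size(0)
    outputs, h_dec_prev = [], None
    for i in range(n_sentences):
        if i > 0:                                                 # i in [2, n]: concat variant of f^rnn_{z_i}
            h_lat = torch.tanh(model.enc2lat(torch.cat([h_dec_prev, z], dim=-1)))
            mu, logvar = model.mu(h_lat), model.logvar(h_lat)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        h_dec = model.lat2dec(z)                                  # H^con_{dec,i}
        tok = torch.full((b, 1), bos_id, dtype=torch.long, device=z.device)
        h, words = h_dec.unsqueeze(0), []
        for _ in range(max_words):                                # greedy choice of \hat x_{i,j}
            out, h = model.word_dec(model.embed(tok), h)
            tok = model.out(out[:, -1]).argmax(-1, keepdim=True)
            words.append(tok)
        outputs.append(torch.cat(words, dim=1))
        h_dec_prev = h_dec
    return outputs                                                # list of (batch, max_words) token ids
```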

KL Objective - starting from the marginal likelihood and applying Jensen's inequality:

$$
\begin{aligned}
\log p_{\theta}(x) &= \log \int_{z} p_{\theta}(x,z) \\
&= \log \int_{z} q_\phi (z|x)\, \frac{p_{\theta}(x,z)}{q_\phi(z|x)} \\
&\ge \int_{z} q(z|x) \log \frac{p(x,z)}{q(z|x)} \quad \text{(Jensen's inequality)} \\
&= \mathbb E_{z\sim q(z|x)} \big[\log p(x,z)-\log q(z|x)\big]
\end{aligned}
$$

Factoring the joint as $\log p(x,z)=\log p(x)+\log p(z|x)$ gives

$$
\begin{aligned}
&= \mathbb E_{z\sim q(z|x)} \big[\log p(x)+\log p(z|x)-\log q(z|x)\big] \\
&= \mathbb E_{z\sim q(z|x)} \Big[\log p(x)-\log \tfrac{q(z|x)}{p(z|x)}\Big] \\
&= \log p_\theta(x) - D_{KL}\big(q_\phi(z|x)\,\|\,p_\theta(z|x)\big),
\end{aligned}
$$

while factoring it as $\log p(x,z)=\log p(x|z)+\log p(z)$ gives

$$
\begin{aligned}
&= \mathbb E_{z\sim q(z|x)} \big[\log p(x|z)+\log p(z)-\log q(z|x)\big] \\
&= \mathbb E_{z\sim q(z|x)} \Big[\log p(x|z)-\log \tfrac{q(z|x)}{p(z)}\Big] \\
&= \mathbb E_{z\sim q(z|x)} \big[\log p_\theta(x|z)\big] - D_{KL}\big(q_\phi(z|x)\,\|\,p_\theta(z)\big) \\
&= {\cal L}(x,\theta,\phi).
\end{aligned}
$$
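Training maximises the last line, ${\cal L}(x,\theta,\phi)$, i.e. minimises the negative ELBO. A small sketch of that loss, assuming a diagonal-Gaussian posterior and a standard-normal prior; a learned conditional prior over $z_i$ would replace the closed-form KL term below.

```python
import torch
import torch.nn.functional as F


def neg_elbo(logits, targets, mu, logvar, pad_id=0):
    """-L(x, theta, phi) = reconstruction NLL + KL(q_phi(z|x) || p(z))."""
    # -E_{z~q}[log p(x|z)], single-sample estimate with teacher forcing
    nll = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                          targets.reshape(-1), ignore_index=pad_id, reduction='sum')
    # analytic KL between N(mu, diag(exp(logvar))) and N(0, I)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return nll + kl
```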
