Presented by Christabella Irwanto
\[ \mathcal{L}(\theta) = -\mathbb{E}_{f \sim P}\left[ \mathbb{E}_{N}\left[ \log Q_{\theta}\left( \{y_i\}_{i=0}^{n-1} \mid O_N, \{x_i\}_{i=0}^{n-1} \right) \right] \right] \]
# Maximize the log-likelihood of the targets under the predicted Gaussian
dist = tf.contrib.distributions.MultivariateNormalDiag(
    loc=mu, scale_diag=sigma)
log_prob = dist.log_prob(target_y)
loss = -tf.reduce_mean(log_prob)  # negative log-likelihood
train_step = optimizer.minimize(loss)
\[\text{ELBO} = \mathbb{E}_{q(z \mid C, T)} \left[ \sum_{t=1}^{T} \log p(y_t^{\ast} \mid z, x_t^{\ast}) + \log \frac{q(z \mid C)}{q(z \mid C, T)} \right]\]
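As a rough illustration (not the authors' code), the ELBO above can be estimated by Monte Carlo with a single reparameterized sample of z. The toy decoder, the dummy data, and all variable names below are hypothetical; only the structure of the estimate (reconstruction term plus log-ratio of the two variational posteriors) follows the formula:

```python
import numpy as np

rng = np.random.default_rng(0)

def diag_gauss_logpdf(x, mu, sigma):
    """Log-density of a diagonal Gaussian, summed over all dimensions."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - 0.5 * ((x - mu) / sigma) ** 2)

# Hypothetical variational posteriors: q(z|C) from context only,
# q(z|C,T) from context plus targets (both diagonal Gaussians).
z_dim = 8
mu_c, sigma_c = np.zeros(z_dim), np.ones(z_dim)
mu_ct, sigma_ct = 0.1 * np.ones(z_dim), 0.9 * np.ones(z_dim)

# Reparameterized sample z ~ q(z|C,T)
eps = rng.standard_normal(z_dim)
z = mu_ct + sigma_ct * eps

# Toy decoder p(y_t | z, x_t): a fixed random linear map, for illustration
x_t = rng.standard_normal((5, 1))          # 5 target inputs
W = rng.standard_normal((z_dim + 1, 1))    # hypothetical decoder weights
mu_y = np.concatenate([np.tile(z, (5, 1)), x_t], axis=1) @ W
y_t = mu_y + 0.1 * rng.standard_normal((5, 1))  # pretend target observations

recon = diag_gauss_logpdf(y_t, mu_y, 0.1)            # sum_t log p(y_t | z, x_t)
log_ratio = (diag_gauss_logpdf(z, mu_c, sigma_c)     # log q(z|C)
             - diag_gauss_logpdf(z, mu_ct, sigma_ct))  # - log q(z|C,T)
elbo = recon + log_ratio  # single-sample ELBO estimate
```

In training one would average this estimate over a batch and ascend its gradient; the reparameterization makes the sample of z differentiable with respect to the posterior parameters.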
(e_1): Linear(in_features=3, out_features=400, bias=True)
(e_2): Linear(in_features=400, out_features=400, bias=True)
(e_3): Linear(in_features=400, out_features=128, bias=True)
(r_to_z_mean): Linear(in_features=128, out_features=128, bias=True)
(r_to_z_logvar): Linear(in_features=128, out_features=128, bias=True)
(d_1): Linear(in_features=130, out_features=400, bias=True)
(d_2): Linear(in_features=400, out_features=400, bias=True)
(d_3): Linear(in_features=400, out_features=400, bias=True)
(d_4): Linear(in_features=400, out_features=400, bias=True)
(d_5): Linear(in_features=400, out_features=1, bias=True)
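As a sanity check on the module listing above, the total parameter count (weights plus biases for each `Linear`, using exactly the shapes printed there) can be computed as:

```python
# (in_features, out_features) for each Linear layer listed above
layers = [
    (3, 400), (400, 400), (400, 128),    # encoder e_1..e_3
    (128, 128), (128, 128),              # r_to_z_mean, r_to_z_logvar
    (130, 400), (400, 400), (400, 400),  # decoder d_1..d_3
    (400, 400), (400, 1),                # decoder d_4, d_5
]
# Each Linear with bias=True has (in_features + 1) * out_features parameters
total = sum((fan_in + 1) * fan_out for fan_in, fan_out in layers)
print(total)  # 780353 parameters in total
```

The decoder's `in_features=130` is consistent with a 128-dimensional latent z concatenated with a 2-dimensional input x, though that pairing is an assumption here.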
Training for 400 epochs took 7 hours on an NVIDIA Quadro P5000 GPU (Paniikki computer)
sigma = 0.1 + 0.9 * tf.sigmoid(log_sigma)
to bound the predicted standard deviation in (0.1, 1.0) and avoid it collapsing to zero.
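A quick numeric check of this bounding trick, using a numpy stand-in for `tf.sigmoid` (the helper name is hypothetical):

```python
import numpy as np

def bounded_sigma(log_sigma):
    """Squash an unconstrained value into the interval (0.1, 1.0)."""
    return 0.1 + 0.9 / (1.0 + np.exp(-log_sigma))

# Even for extreme raw outputs, sigma never reaches 0 or exceeds 1
for raw in (-100.0, 0.0, 100.0):
    s = bounded_sigma(raw)
    assert 0.1 <= s <= 1.0
print(bounded_sigma(0.0))  # 0.55 at the midpoint
```

Keeping sigma away from zero prevents the log-likelihood from diverging when a prediction happens to sit exactly on a target.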