5.2 Future Work
5.2.3 The Predictive Model
I acknowledge that the proposed predictive model leaves room for improvement, especially regarding the choice of loss functions and their weighting. Also, training this model end-to-end with a gradient-based optimisation method remains a challenge due to the inhibited flow of gradients in the VQ-VAE. Nevertheless, the model has proven to be appropriate for predicting long-term futures of a visual environment, and the capabilities it gains from its use of discrete variables lead to other interesting ideas for future work:
The current predictive model is deterministic; a probabilistic variant could be designed by, for example, introducing multinomial sampling in the output of the memory component. Another idea involves the use of transformers (Vaswani et al., 2017), which are sequence models shown to learn sequences of discrete variables effectively. An interesting experiment would therefore be to replace the memory component’s LSTM network with a transformer network.
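
To make the idea of multinomial sampling more concrete, the following is a minimal sketch and not part of the implemented model. It assumes that the memory component emits, for every latent position, a vector of unnormalised scores (logits) over the entries of the VQ-VAE codebook. The function names `softmax`, `sample_codes` and `argmax_codes`, as well as the `temperature` parameter, are hypothetical and introduced only for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert unnormalised logits into a probability distribution."""
    scaled = [value / temperature for value in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def sample_codes(logits_per_position, temperature=1.0, rng=None):
    """Probabilistic variant (assumed): draw one codebook index per latent
    position from the multinomial distribution defined by the logits."""
    rng = rng or random.Random()
    codes = []
    for logits in logits_per_position:
        probs = softmax(logits, temperature)
        index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        codes.append(index)
    return codes


def argmax_codes(logits_per_position):
    """Deterministic baseline: always pick the most likely codebook index."""
    return [
        max(range(len(logits)), key=logits.__getitem__)
        for logits in logits_per_position
    ]


if __name__ == "__main__":
    # Two latent positions, three codebook entries each (toy values).
    toy_logits = [[1.0, 2.0, 0.5], [0.1, 0.1, 3.0]]
    print("deterministic:", argmax_codes(toy_logits))
    print("sampled:      ", sample_codes(toy_logits, rng=random.Random(1)))
```

Under these assumptions, repeated calls to `sample_codes` with different random states could yield different sequences of discrete codes, and therefore different predicted futures, whereas `argmax_codes` always returns the same one.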
