STM.transpose() is an experiment with an unfolded version of LSTMs. The hypothesis is that a deep neural network following the same architecture as an LSTM unfolded through time is efficiently trainable with backpropagation, and that its gradients (even those of the bottom layers) are not affected by the 'vanishing gradient' problem. This holds even when the weights are not 'tied' across layers.
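To make the idea concrete, here is a minimal sketch (not the repository's actual code) of what such an unfolded network looks like: the LSTM cell's gating structure is stacked as feedforward layers, with each layer holding its own independent, untied parameter set where a conventional LSTM would reuse one shared set across time steps. All names and dimensions below are illustrative assumptions.

```python
# Hypothetical sketch: an LSTM unfolded through "time" as a deep
# feedforward network with UNTIED weights -- every layer has its own
# parameters instead of sharing one set across all steps.
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_layer(x, h, c, p):
    """One unfolded layer: standard LSTM gating using this layer's own weights."""
    z = np.concatenate([x, h])
    i = sigmoid(p["Wi"] @ z + p["bi"])   # input gate
    f = sigmoid(p["Wf"] @ z + p["bf"])   # forget gate
    o = sigmoid(p["Wo"] @ z + p["bo"])   # output gate
    g = np.tanh(p["Wg"] @ z + p["bg"])   # candidate cell state
    c = f * c + i * g                    # cell state update
    h = o * np.tanh(c)                   # hidden state
    return h, c


def init_params(rng, n_in, n_hid, depth):
    """Untied weights: one independent parameter set per unfolded layer."""
    def layer():
        weights = {name: rng.standard_normal((n_hid, n_in + n_hid)) * 0.1
                   for name in ("Wi", "Wf", "Wo", "Wg")}
        biases = {name: np.zeros(n_hid) for name in ("bi", "bf", "bo", "bg")}
        return {**weights, **biases}
    return [layer() for _ in range(depth)]


rng = np.random.default_rng(0)
n_in, n_hid, depth = 4, 8, 6          # illustrative sizes
params = init_params(rng, n_in, n_hid, depth)

x = rng.standard_normal(n_in)         # the input fed at each layer
h, c = np.zeros(n_hid), np.zeros(n_hid)
for p in params:                      # depth plays the role of time
    h, c = lstm_layer(x, h, c, p)
print(h.shape)  # (8,)
```

Only the forward pass is shown; in the experiment the whole stack would then be trained end-to-end with ordinary backpropagation, and the question is whether the gradients reaching the bottom layers stay usable.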
The code is public, and we would like more people to collaborate with us to make it better.
Date: Nov 5, 2016
Author: John Gamboa