Toy Dreams

Posted: January 5, 2017 at 5:15 pm

I’ve been doing a lot of reading and working through tutorials to get a sense of what I need to do for the “Dreaming” side of this project. I initially planned to use TensorFlow, but found it too low-level and could not find enough examples, so I ended up using Keras. Performance with Keras should be very close, since I’m using TensorFlow as the back-end. I created the following simple toy sequence to play with:

The Y axis is time and the X axis is a proxy for the distribution of cluster positions. This simple sequence is just meant to be easy to recognize visually. According to the conception of dreaming I’m working with, dreams are feedback within a predictive model trained on waking experience. As I learned in my PhD, my use of feedback to reconstruct the sequence (where the output of the network is used as the next input) is quite an unusual use case for ML. Even if an MLP properly learns the sequence (such that it produces the correct t+1 pattern when fed pattern t), the proper sequence is not likely to be reconstructed beyond the second iteration of feedback. See this post for my previous results using an MLP.

The plan is to use a network with more layers and with LSTM units, which allow the network to learn state; hopefully that will let feedback reconstructions resemble the original sequences more closely. I’m currently using a three-layer network with a single LSTM layer and two dense layers, all with the same number of units as there are inputs (11 in this toy case). The network was trained using epochs and reached an MSE of 9.8322e-07. This is over-trained, but I’m not concerned with that for this project. The following image shows the sequence reconstruction using feedback. Note that it is already working much better than the MLP approach used in the PhD.
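For reference, the architecture described above could be sketched in Keras roughly as follows. This is only a sketch under assumptions, not the exact code used here: the function name build_model, the single-timestep input shape, the ReLU activation on the first dense layer, and the Adam optimizer are all illustrative guesses. Only the layer types, the 11 units per layer, and the MSE loss come from the description above.

```python
def build_model(n_features=11, timesteps=1):
    """Build one LSTM layer followed by two dense layers, all n_features wide."""
    # Imported inside the function so the sketch can be loaded without
    # TensorFlow installed; the import only runs when a model is built.
    from tensorflow import keras

    model = keras.Sequential([
        # Assumption: each sample is a window of `timesteps` patterns,
        # each pattern having n_features values.
        keras.Input(shape=(timesteps, n_features)),
        keras.layers.LSTM(n_features),
        # Assumption: ReLU here and a linear output layer.
        keras.layers.Dense(n_features, activation="relu"),
        keras.layers.Dense(n_features),
    ])
    # MSE matches the error reported in the post; the optimizer is a guess.
    model.compile(optimizer="adam", loss="mse")
    return model
```

Training would then pair each pattern at time t with the pattern at t+1 as its target, so that the network learns one-step prediction.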

The feedback was initiated using the first input pattern (the vector at the top of the first image in this post), and I rounded the float predictions to ints before feeding them back. The first two iterations of feedback are the same as the original sequence, and then things start diverging from there. Note that the aim here is a generative process that creates sequences that resemble the input but are not identical to it. The sums of each row are important, because those sums define the number of percepts visible for that frame. In this case, the sums are quite consistent with the original sequence. The secondary staircases are problematic though, as columns have no relationship to each other; although we can see the staircase pattern here, its location would make it unreadable to the viewer.
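The feedback procedure itself is simple enough to sketch in plain Python. The names feedback_rollout and shift_right are hypothetical, and shift_right is only a stand-in for a trained network: it rotates the pattern by one position and returns slightly noisy floats so that the rounding step has something to do.

```python
def feedback_rollout(predict_next, seed, steps):
    """Generate a sequence by feeding each rounded prediction back in as input."""
    pattern = [int(v) for v in seed]
    sequence = [pattern]
    for _ in range(steps):
        prediction = predict_next(pattern)             # floats from the model
        pattern = [int(round(v)) for v in prediction]  # round to ints before feedback
        sequence.append(pattern)
    return sequence


def shift_right(pattern):
    """Stand-in predictor (not a real network): rotate right by one, as floats."""
    shifted = pattern[-1:] + pattern[:-1]
    return [0.9 * v + 0.04 for v in shifted]


seed = [1] + [0] * 10              # 11 inputs, as in the toy case
rollout = feedback_rollout(shift_right, seed, steps=3)
row_sums = [sum(row) for row in rollout]  # proxy for the percepts visible per frame
```

With a trained Keras model, predict_next would instead wrap model.predict, reshaping the pattern to whatever input shape the network expects. Comparing row_sums against the row sums of the original sequence gives a quick check on whether the generated frames keep a plausible number of percepts.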

Of course, this toy example is overly simple and it’s hard to predict what will happen with the real data. I’ll work with a subset of the real data for the next stage and see if anything can be learned from it using this network architecture.