States and Prediction

To start working on prediction and learning for DM3, I’ve started dumping some state data to see what it looks like over time. The idea is that learning will only happen once the maximum number of clusters has been reached for foreground or background, and there would be a separate instance of the learner for each. This is because both RL and MLP methods require a fixed number of dimensions. Once the maximum has been reached, all new percepts are merged with the nearest cluster, so from that point on we can represent any image in the system as a vector of booleans where each element corresponds to one of the clusters. In the following plot, each row represents a moment in time and each column represents a particular percept. Black percepts do not appear at that moment in time, while grey percepts do.
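As a rough illustration of that encoding, here is a minimal sketch in Python. It assumes each percept and each cluster can be summarised as a numeric feature tuple and that "nearest" means Euclidean distance. Both are assumptions for the example, and the names `nearest_cluster` and `encode_frame` are hypothetical rather than taken from DM3. The sketch only assigns percepts to clusters; it does not update the clusters themselves as a real merge would.

```python
from math import dist

def nearest_cluster(percept, centroids):
    """Return the index of the centroid closest to the percept.

    Assumption: percepts and centroids are numeric tuples of equal length,
    compared by Euclidean distance.
    """
    return min(range(len(centroids)), key=lambda i: dist(percept, centroids[i]))

def encode_frame(percepts, centroids):
    """Encode one frame as a fixed-length boolean state vector.

    Element i is True if at least one percept in the frame falls into
    cluster i, and False otherwise.
    """
    state = [False] * len(centroids)
    for percept in percepts:
        state[nearest_cluster(percept, centroids)] = True
    return state

# Hypothetical example: three clusters, two percepts in the frame.
centroids = [(0.0, 0.0), (10.0, 10.0), (20.0, 20.0)]
frame_state = encode_frame([(1.0, 1.0), (19.0, 18.0)], centroids)
```

Because the vector length equals the number of clusters, it stays fixed once the maximum is reached, which is what the learner needs.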


This is quite a short test, but it’s already obvious that the percepts on the right of the vector are the most often activated. This is because the merging process removes percepts from anywhere in the list and appends the newly merged clusters on the right. The result is that for each perceptual frame, the percepts that are activated are those at the end of the list. For background percepts (left panel) it seems that a simple algorithm would be highly predictive: just activate the 50 percepts on the right of the list and add some noise. During perception, the percepts on the right of the list will always be the most recently clustered. It seems that the inadvertent structure of the list of clusters causes it to be nearly sorted by time. If this is true, even a sliding window from left to right would be a pretty good model. The question is whether this pattern continues beyond this short test.

As for foreground percepts, the structure is much less clear. Most images don’t have any foreground percepts, hence the predominance of black. Additionally, the number of foreground percepts is quite small. It seems that a similar model would hold, where the number of activated percepts is very small, with a lot of noise. Foreground percepts do appear to occur in clumps, which makes sense since only a very fast-moving percept could appear in just a single frame.

The next longer-term test will clarify this pattern. Perhaps there is also some periodicity beyond the day-night cycle that the predictor could learn. It makes sense that rush hours and lunch would be more active than the rest of the day.
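The simple background model described above (activate the rightmost percepts and add noise) could serve as a baseline against which any learned predictor is compared. The following is a minimal sketch under stated assumptions: the function names, the `flip_prob` noise parameter, and the per-element accuracy measure are all hypothetical choices made for illustration, not part of DM3.

```python
import random

def baseline_prediction(n_clusters, k=50, flip_prob=0.0, rng=None):
    """Predict a boolean state vector with the rightmost k percepts active.

    Assumption: noise is modelled by flipping each element independently
    with probability flip_prob.
    """
    rng = rng or random.Random()
    state = [i >= n_clusters - k for i in range(n_clusters)]
    return [(not s) if rng.random() < flip_prob else s for s in state]

def accuracy(predicted, observed):
    """Fraction of elements where the prediction matches the observed frame."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

# Hypothetical example: 8 clusters, the 3 rightmost predicted active.
predicted = baseline_prediction(8, k=3)
observed = [False, False, False, False, True, True, True, True]
score = accuracy(predicted, observed)
```

If a trained predictor cannot beat this baseline on the longer test, the structure it is learning is probably just the ordering of the cluster list.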

This raises the question of what temporal representation is like in dreams. Do dreams represent a compression of time such that only interesting events occur? As discussed in the Integrative Theory, bizarre dreams are unusual and most dreams are quite ordinary. Since much of the system’s experience would be a slightly modulating plain background image, it would make sense for it to dream of exactly that, but that could certainly be boring for the viewer. It seems there needs to be a role for habituation in the learning algorithm. In the associative conception of DM3, habituation meant that the degree of activation of a percept would be lowered the more often it is activated. In the absence of activation, a percept’s ability to be activated would recover incrementally (dishabituation).
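To make the habituation idea concrete, here is one possible sketch. It assumes each percept carries a sensitivity between 0 and 1 that is multiplied by a decay factor whenever the percept is active and recovers by a fixed increment otherwise. The multiplicative decay, the linear recovery, and the parameter values are assumptions chosen for illustration; the source only states that activation is lowered with repeated activation and recovers incrementally.

```python
def update_habituation(sensitivity, active, decay=0.8, recovery=0.05):
    """Advance habituation by one time step.

    sensitivity: list of floats in [0, 1], one per percept.
    active: list of booleans, the frame's boolean state vector.
    Returns (activation, new_sensitivity), where activation is the
    continuous degree of activation for this frame.
    """
    activation = []
    new_sensitivity = []
    for s, a in zip(sensitivity, active):
        if a:
            # An active percept fires at its current sensitivity, then habituates.
            activation.append(s)
            new_sensitivity.append(s * decay)
        else:
            # An inactive percept does not fire and recovers, capped at 1.0.
            activation.append(0.0)
            new_sensitivity.append(min(1.0, s + recovery))
    return activation, new_sensitivity

# Hypothetical example: percept 0 is active twice in a row, percept 1 never.
sens = [1.0, 1.0]
act1, sens = update_habituation(sens, [True, False])
act2, sens = update_habituation(sens, [True, False])
```

In this sketch a percept that is constantly present, such as a static background cluster, drifts toward zero activation, while a rarely seen percept fires at close to full strength.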

So if these boolean values are changed to a continuous degree of activation, then habituation could be manifest in that degree of activation, which could in turn be fed into the predictor. The pattern would obviously be much more complex and more difficult to learn, but it would reflect a much more nuanced sense of the sequence of events. It is also possible that the appearance of novel (less habituated) percepts is difficult to predict and shows up more as noise. Still, a system that learns from novel stimuli would be more cognitively realistic, because those novel percepts would be more salient.