150,000 frames, Single epoch sequential training

Posted: November 22, 2013 at 8:23 pm

After the previous tests I have a better sense of the prediction problem. We realized that my previous tests (corresponding to a few days of the full system processing) may not have contained enough data for sufficient training. Additionally, I found a few issues with the segmentation code that could have changed the behaviour of clusters over time. I took the 6 days necessary to process a new data-set. The full data-set is composed of only the daytime periods (including sunset and sunrise), and includes approximately 300,000 frames. In 6 days I processed approximately half the set, 150,000 frames. Note that this is actually significantly more data, for the same number of frames, than in previous examples, because those included night frames. Following is the resulting error from the same MLP learning procedure as used previously, presenting each pattern only once without repeated epoch training, and reporting error after each iteration:

149995_backgroundState-Days-AB-filter-1-sequential.error
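
For readers unfamiliar with this kind of training, the following is a minimal sketch of single-epoch sequential learning, not the code used to produce the plot above. It assumes a one-hidden-layer sigmoid MLP that predicts the next state vector from the current one; the class `TinyMLP`, the function `train_single_epoch`, the learning rate, and the layer sizes are all hypothetical. Each pattern is presented exactly once, in order, and the error is recorded at every iteration before the weights are updated.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One-hidden-layer MLP trained by plain online backpropagation (illustrative only)."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.5, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        # Each row holds the weights of one unit; the last entry is the bias weight.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]  # append constant bias input
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return xb, hb, y

    def train_step(self, x, target):
        """Present one pattern once; return its squared error measured before the update."""
        xb, hb, y = self.forward(x)
        err = [t - o for t, o in zip(target, y)]
        mse = sum(e * e for e in err) / len(err)
        d_out = [e * o * (1.0 - o) for e, o in zip(err, y)]
        # Hidden deltas use the output weights as they were before this update.
        d_hid = [
            hb[j] * (1.0 - hb[j]) * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
            for j in range(len(self.w1))
        ]
        for k, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] += self.lr * d_out[k] * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] += self.lr * d_hid[j] * xb[i]
        return mse


def train_single_epoch(net, frames):
    """Train on each (frames[t] -> frames[t + 1]) pair once, in sequence."""
    return [net.train_step(frames[t], frames[t + 1]) for t in range(len(frames) - 1)]


# Toy stand-in for the real state vectors: two alternating patterns.
frames = [[1, 0, 0, 1], [0, 1, 1, 0]] * 500
net = TinyMLP(n_in=4, n_hidden=4, n_out=4)
errors = train_single_epoch(net, frames)
```

Plotting `errors` against the iteration index gives the kind of curve shown above. Because there is no second pass over the data, the curve reflects how stable the input is as much as how well the network has learned.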

As learning is going to be on-line, it is expected that the error will rise and fall as a result of stability, or the lack thereof, in the input frames. Following are the results, where the input is on the left and the raw (non-thresholded) output is on the right:

149995_backgroundState-Days-AB-filter-1-sequential.results

The following is the corresponding histogram that shows the distribution of the number of active percepts for each time-slice:

149995_backgroundState-Days-AB-filter-1-sequential.results.hist
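
As a rough illustration of how such a histogram can be computed, the sketch below counts, for each time-slice, how many output units exceed a threshold, and then tallies how often each count occurs. The function name `active_count_histogram` and the 0.5 threshold are assumptions made for this example; the raw network outputs are not thresholded in the results above.

```python
from collections import Counter


def active_count_histogram(outputs, threshold=0.5):
    """Map 'number of active units in a time-slice' to 'number of time-slices'.

    outputs: one row of raw unit activations per time-slice.
    threshold: assumed cut-off above which a unit counts as active.
    """
    counts = [sum(1 for v in row if v > threshold) for row in outputs]
    return Counter(counts)


# Four toy time-slices with three units each.
toy_outputs = [
    [0.9, 0.1, 0.8],
    [0.2, 0.1, 0.3],
    [0.6, 0.7, 0.9],
    [0.9, 0.4, 0.6],
]
hist = active_count_histogram(toy_outputs)
for n_active in sorted(hist):
    print(n_active, "#" * hist[n_active])
```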

Clearly not a good result; the distribution of active clusters over time is uniform. I tried a quick dream simulation with this network, and using the same initial conditions as before we ended up with a stable (static) dream. I have not gotten to writing the program that generates a number of dreams for different initial conditions; it is meant to give an indication of the range of periodicity or stability in the dreams. I’m not sure of the best way to approach choosing the input patterns. Using the data-set inputs makes sense, but does not tell us what the behaviour would be with novel input (which is more than possible in the real system). Maybe this analysis is not needed, because the current idea is not to gate input to the system entirely. A little spontaneous noise would always be injecting some variation into the system. I imagine this would make stable (static) output much less likely. I’m meeting Steven tomorrow to discuss the predictive conception of the Integrative Theory, and one of the main topics is this gating question.
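
To make the dream analysis concrete, here is a minimal sketch of what such a program might look like. It is not the simulation used above. It assumes a dream is produced by feeding the thresholded output of the predictor back in as its next input, and that spontaneous noise can be modelled as Gaussian jitter added before thresholding. The names `dream` and `detect_period`, the `step` callable standing in for the trained network, and the noise model are all hypothetical.

```python
import random


def dream(step, initial, n_steps, noise=0.0, threshold=0.5, seed=0):
    """Run a closed-loop simulation: each output becomes the next input.

    step: callable mapping a state tuple to raw activations (stands in for the network).
    noise: standard deviation of assumed Gaussian jitter added before thresholding.
    """
    rng = random.Random(seed)
    state = tuple(initial)
    states = [state]
    for _ in range(n_steps):
        raw = step(state)
        if noise > 0.0:
            raw = [v + rng.gauss(0.0, noise) for v in raw]
        state = tuple(1 if v > threshold else 0 for v in raw)
        states.append(state)
    return states


def detect_period(states):
    """Return the gap between the first repeated state and its earlier occurrence.

    For a noise-free deterministic step this is the cycle length:
    1 means a static dream, None means no state repeated within the run.
    """
    seen = {}
    for t, s in enumerate(states):
        if s in seen:
            return t - seen[s]
        seen[s] = t
    return None


# Toy predictors standing in for the trained network.
def rotate(state):
    return state[1:] + state[:1]


def hold(state):
    return state


periodic = dream(rotate, (1, 0, 0), n_steps=10)
static = dream(hold, (1, 0, 1), n_steps=10)
print(detect_period(periodic), detect_period(static))
```

Calling `dream` over a list of initial conditions and collecting the values returned by `detect_period` would give a rough picture of the range of periodicity. With `noise` above zero, an exact repeat no longer implies a cycle, so the period measure is only meaningful for noise-free runs.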