I’m currently running a long test on the longest contiguous part of the data set, but in previous tests (with much shorter training periods) the dreams have been quite static. Following is a sample frame from one of these runs. In this case the dream is the sum of perceptual activation and predictor feedback:
Once the current long test is complete we’ll see whether this holds over more training samples. Of course, since the predictor is learning from the perceptual clusters, it depends on their stability (or lack thereof). The combination of a few visible percepts and their extremely ephemeral quality makes the dream and mind-wandering images quite unsatisfactory. This is emphasized by the fact that the sensory image is not rendered in the background during dreaming and mind-wandering, leaving the percepts to stand on their own.
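As a minimal sketch of the dream composition described above, assuming per-percept activations in [0, 1] (the function name and the clamping are my own, hypothetical choices), the sum of perceptual activation and predictor feedback might look like:

```cpp
#include <algorithm>  // std::clamp (C++17)

// Hypothetical helper: the dream-frame activation of a single percept,
// taken as the clamped sum of its current perceptual activation and the
// predictor's feedback signal, both assumed to lie in [0, 1].
double dreamActivation(double perceptual, double feedback) {
    return std::clamp(perceptual + feedback, 0.0, 1.0);
}
```

A percept with weak perceptual activation can then still be pushed into visibility by strong predictor feedback, and vice versa.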
I was thinking about solutions to this problem at the level of the percept clustering method. In looking at the memory leak, it was suggested that I remove the percepUnit images from the percepUnit class and store them in a separate structure. While this will complicate the code a bit, it has some potential benefits. Hopefully, by removing the cv::Mats from the class, the memory leak will stop. Additionally, performance could be improved if the images are stored in a buffer of constant resolution. This would mean that percept images would not be cropped, but left at full-frame resolution. This would clearly use more memory, but considering the memory leak, that may not be a problem. The additional benefit is that the percepts will be locked in their sensed positions, which means they will be much more stable over time, because the instability of the segmented regions will only affect their edges, not their positions. The only issue with this method is that moving foreground objects will not be merged into clusters, because their shifting position (relative to the frame) will cause them to fade away, as in a long exposure. We’re not actually losing anything here, though, because the same thing happens with the current system: moving objects disappear.
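A rough sketch of that separation, with a plain struct standing in for cv::Mat and with hypothetical class and method names, might look like this: the percepUnit keeps only an id, while the full-frame images live in a separate store keyed by that id.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Stand-in for cv::Mat: a constant-resolution, full-frame buffer, so
// percept images are never cropped and keep their sensed positions.
struct Frame {
    static const int width = 640, height = 480;  // assumed frame size
    std::vector<unsigned char> pixels;
    Frame() : pixels(static_cast<std::size_t>(width) * height, 0) {}
};

// The percepUnit keeps only lightweight cluster state plus an id.
struct PercepUnit {
    int id;
    // ... cluster statistics, activation, etc.
};

// Separate structure holding the heavy image data, keyed by percept id.
// Copying or destroying PercepUnits no longer touches image memory, and
// releasing an image happens in exactly one place.
class PerceptImageStore {
public:
    Frame& imageFor(int id) { return images_[id]; }  // creates on first use
    void release(int id)    { images_.erase(id); }
    std::size_t size() const { return images_.size(); }
private:
    std::unordered_map<int, Frame> images_;
};
```

With all image lifetimes funneled through one container, a leak would show up as the store's size growing without bound, which is much easier to audit than per-instance cv::Mat ownership.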
Another contributor to the ephemeral quality of percepts is the large size of their regions. The test data is so complex that when the image is broken into 100 regions, many of those regions are not perceptually contiguous, as is fairly clear from the original and mean-shift-segmented images below, where the white truck, grass, asphalt and dark areas of the building are all considered the same region:
The obvious solution is to increase the number of segmented regions. Paradoxically, this would actually decrease segmentation time, because significant resources are spent growing small regions into the larger areas requested. The cost will instead shift to clustering time, where more percepts will need to be compared for each frame. Additionally, the maximum total number of clusters (currently set to 2000) is far too small to capture any significant breadth of the visual scene’s complexity as it changes over time.
If I had to choose between having the system dream of foreground objects, or increasing the fidelity of the clusters and the density and visual complexity of the dreams with more percepts, I would easily choose the latter. An additional problem with this approach is that the current conception of arousal is based on changes in the scene, which correspond to moving foreground objects. Thus, the system is learning predictions of moving objects, but manifesting that learning in percepts that do not represent those moving objects well. Perhaps these changes could allow the earlier conceptions of arousal to be implemented: a threshold on the MSE in the clustering process, or the number of percepts whose state (present or absent) changes between subsequent frames.
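The second of those arousal formulations, counting how many percepts flipped between present and absent from one frame to the next, could be sketched as follows (the function name, normalization, and use of bool vectors are my own assumptions):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical arousal measure: the fraction of percepts whose state
// (present or absent) differs between two consecutive frames.
double presenceChangeArousal(const std::vector<bool>& prev,
                             const std::vector<bool>& curr) {
    if (prev.empty() || prev.size() != curr.size()) return 0.0;
    std::size_t changed = 0;
    for (std::size_t i = 0; i < prev.size(); ++i)
        if (prev[i] != curr[i]) ++changed;
    return static_cast<double>(changed) / prev.size();
}
```

Normalizing by the total percept count keeps the measure comparable as the number of clusters grows, which matters if the maximum is raised well beyond 2000.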