Posted: February 21, 2013 at 2:22 pm
I met with Steven Barnes today to talk about the previous post, and through discussion we clarified some of the points of the learning algorithm proposed earlier. Let's go through an example of three consecutive frames (t=1,2,3) in a perceptual case. For simplicity we do not describe the clustering process here, though the algorithm may have unforeseen consequences in relation to clustering. The basic premise is that priming is predictive, and therefore that future percepts are expected to be in the same context as current percepts.
All percepts segmented in t=1 serve as the initial context. Each percept has its own sense of context, which is manifest in a hash table that holds the link strength from that percept to all other percepts. A link strength of zero indicates that a pair of percepts does not appear in the same context. Link strengths are initially determined by the Euclidean distance between the two percepts in X, Y and T (where time is just considered another dimension). Since all percepts in the first frame occur at the same time, their links depend only on their distances within the frame. Two percepts that are very close together are strongly linked, while two percepts located in opposite corners are less strongly linked.
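As a rough sketch of the per-percept hash tables described above: the post only says that closer percepts are linked more strongly, so the mapping 1 / (1 + d) from distance to strength is an assumption, as are the names `initial_strength` and `build_links` and the example coordinates.

```python
import math

def initial_strength(a, b):
    """Link strength between two percepts given as (x, y, t) tuples.

    Assumption: strength falls off as 1 / (1 + d), where d is the Euclidean
    distance in X, Y and T. Any monotonically decreasing mapping would do.
    """
    return 1.0 / (1.0 + math.dist(a, b))

def build_links(percepts):
    """Build one hash table per percept: name -> {other name: strength}."""
    links = {name: {} for name in percepts}
    for a, pos_a in percepts.items():
        for b, pos_b in percepts.items():
            if a != b:
                links[a][b] = initial_strength(pos_a, pos_b)
    return links

# Three hypothetical percepts in the first frame (t=1).
frame1 = {"A": (0, 0, 1), "B": (3, 4, 1), "C": (100, 100, 1)}
links = build_links(frame1)

# A and B are close together, so their link is stronger than A to C.
print(links["A"]["B"], links["A"]["C"])
```

A missing key in a percept's table plays the role of a link strength of zero, so `links[a].get(b, 0.0)` reads the strength for any pair.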
All percepts segmented in t=2 create links to all percepts in t=1, again according to their Euclidean distance, this time including time. Percepts in t=2 that are closest to percepts in t=1 (in space and time) are linked to a greater degree. Percepts in t=2 set activation in percepts in t=1, causing them to be primed according to the strength of the link. Percepts in t=1 are primed because it is expected that the context that links t=1 and t=2 will be extended in t=3. Primed percepts are expected to occur in the next frame.
For each segmented percept (activated by perception):
- If the percept in t=3 is linked to any percept in t=2, then strengthen the link between the percept in t=3 and all linked percepts in t=2. This occurs when the prediction is correct for a particular percept.
- If the percept in t=3 is not linked to any percept in t=2, then create a link from the percept in t=3 to every percept in t=2 according to their Euclidean distance (in space and time). This situates the new stimulus in the context of the previously activated percepts.
If a percept has been primed (by percepts at t=2), but is not present (not activated perceptually), then weaken the link between this percept at t=2 and the percepts in t=1 it is linked to. This occurs when the prediction fails, and weakens the links in the context that made the prediction.
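The update rules above could be sketched as follows. This is one possible reading, not a definitive implementation: the multiplicative factors `STRENGTHEN` and `WEAKEN`, the 1 / (1 + d) distance mapping, the function names, and the simplification that a failed prediction weakens the links between the absent percept and the percepts of the preceding frame are all assumptions.

```python
import math

STRENGTHEN = 1.1  # hypothetical gain applied when a prediction is confirmed
WEAKEN = 0.9      # hypothetical loss applied when a prediction fails

def strength_from_distance(a, b):
    # Assumption: closer in (x, y, t) means stronger, via 1 / (1 + d).
    return 1.0 / (1.0 + math.dist(a, b))

def set_link(links, a, b, value):
    """Store the link in both percepts' hash tables."""
    links.setdefault(a, {})[b] = value
    links.setdefault(b, {})[a] = value

def update_links(links, positions, previous, perceived, primed):
    """Apply one learning step.

    previous:  names of percepts in the preceding frame
    perceived: set of names activated by perception in the current frame
    primed:    set of names that were expected in the current frame
    """
    for p in perceived:
        linked = [q for q in previous if links.get(p, {}).get(q, 0.0) > 0.0]
        if linked:
            # Prediction correct: strengthen every existing link.
            for q in linked:
                set_link(links, p, q, links[p][q] * STRENGTHEN)
        else:
            # New stimulus: situate it in the previous context by distance.
            for q in previous:
                set_link(links, p, q,
                         strength_from_distance(positions[p], positions[q]))
    for p in primed - perceived:
        # Prediction failed: weaken the links that made it.
        for q in previous:
            w = links.get(p, {}).get(q, 0.0)
            if p != q and w > 0.0:
                set_link(links, p, q, w * WEAKEN)
    return links

# Hypothetical example: A and B are in t=2; C, D and E belong to t=3.
positions = {"A": (0, 0, 2), "B": (10, 0, 2),
             "C": (0, 1, 3), "D": (5, 5, 3), "E": (9, 9, 3)}
links = {}
set_link(links, "C", "A", 0.5)  # C was predicted from A and does appear
set_link(links, "E", "A", 0.4)  # E was predicted from A but does not appear
update_links(links, positions, previous=["A", "B"],
             perceived={"C", "D"}, primed={"C", "E"})
print(links["C"]["A"], links["E"]["A"], links["D"])
```

In this example C keeps only its strengthened link to A, D gains distance-based links to both A and B, and the link between E and A is weakened.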
This learning mechanism allows the system to learn predictive contexts of percepts, which are represented as a graph of relations between percepts. If we activate a particular percept and allow it to propagate that activation through its links, then the percepts that are nearest (most related to the context of the initial percept) are the most activated. Each of those linked percepts then continues to propagate activation, activating percepts in further and further removed contexts, radiating outwards in space and time from the initial activation. Assuming a degree of signal decay during activation, a weak activation would activate only the immediate context, while a strong activation could activate nearly the whole network (assuming there are no island contexts whose percepts only link to each other). This is the type of activation that occurs during perception, as initiated by an external stimulus. Let's say that at t=4 an external stimulus causes the activation of percept A. That would cause the activation of all percepts in the context of A, including those expected to occur in t=5 (for example percept B). If those percepts do occur in t=5, then they are already activated. So, other than through learning, how is this double activation (percept B is activated by percept A in the previous time-step, and also activated perceptually at t=5) manifested visually? The answer could lie in a consideration of the confidence of merged percepts.
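The propagation described above can be illustrated with a small spreading-activation sketch. The `decay` factor, the cut-off `floor`, the clamping of link strengths to 1.0, and the four-percept chain are assumptions made for the example, not details from the system.

```python
def spread(links, source, strength, decay=0.5, floor=0.05):
    """Propagate activation outwards from `source` through the link graph.

    Each hop multiplies the signal by the link strength (clamped to 1.0)
    and by `decay`; signals at or below `floor` are dropped. A percept
    keeps the strongest signal that reaches it.
    """
    activation = {source: strength}
    frontier = [source]
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbour, weight in links.get(node, {}).items():
                signal = activation[node] * min(weight, 1.0) * decay
                if signal > floor and signal > activation.get(neighbour, 0.0):
                    activation[neighbour] = signal
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return activation

# A hypothetical chain of contexts: A - B - C - D, all links of strength 0.8.
chain = {
    "A": {"B": 0.8},
    "B": {"A": 0.8, "C": 0.8},
    "C": {"B": 0.8, "D": 0.8},
    "D": {"C": 0.8},
}

weak = spread(chain, "A", 0.2)    # reaches only the immediate context (A, B)
strong = spread(chain, "A", 1.0)  # reaches the whole chain
print(sorted(weak), sorted(strong))
```

Because every hop attenuates the signal, the loop terminates even when the graph contains cycles, and an island context with no links to the source is never reached.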
Confidence is a measure of the variance of a clustered percept. A clustered percept is composed of pixel regions segmented at various frames that are summed into a prototype representation. While the constituent pixel regions must satisfy a global threshold to be merged, there is still likely to be some variance in the constituents. Perhaps the degree to which percepts are initially activated should be proportional to the percept's confidence. Percepts with a low degree of confidence (high degree of variance) would be activated to a lesser degree. The high degree of variance indicates that these percepts were only partially recognized, and therefore it is reasonable for them to be presented with less visual emphasis. Percepts with a high degree of confidence are strongly recognized, and visually presented with more emphasis. Weakly activated percepts with low confidence could be boosted in activation by priming: an expected, but weakly recognized, percept may be presented with the same emphasis as a predicted percept with a high degree of confidence.
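To make the proposal concrete, here is a minimal sketch in which the activation of a perceived percept is its confidence plus any priming it has received. The mapping from variance to confidence, 1 / (1 + variance), the additive priming boost, the cap at 1.0, and the example numbers are all assumptions.

```python
def confidence(variance):
    # Assumption: confidence is 1 for zero variance and falls towards 0
    # as the variance of the cluster's constituents grows.
    return 1.0 / (1.0 + variance)

def visual_activation(variance, priming=0.0):
    """Activation (visual emphasis) of a perceived percept, capped at 1.0."""
    return min(1.0, confidence(variance) + priming)

# A strongly recognized percept that was not primed.
strong_unprimed = visual_activation(variance=0.25)

# A weakly recognized percept, first without and then with priming.
weak_unprimed = visual_activation(variance=4.0)
weak_primed = visual_activation(variance=4.0, priming=0.6)

print(strong_unprimed, weak_unprimed, weak_primed)
```

With these numbers the weakly recognized but expected percept ends up with roughly the same emphasis as the strongly recognized one, which is the effect described above.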
Early on in the development of this project it was clear that, in human perception, high-level cognition changes perception. In the current conception of the system the clustering process constructs perception, but its parameters are fixed rather than determined by learning. The segmentation algorithms are also fixed and not affected by learning. At this point, the only way higher-level processes affect the perception of individual percepts is by modulating their activation (due to priming). The proposal that percepts with low confidence (discussed above) be presented with less opacity (less activation) could allow some flexibility in the threshold of similarity used in clustering. Perhaps the variance of confidence across all percepts could modulate the clustering threshold to maintain a particular distribution of confidence across percepts.