Free Energy, Prediction and MDP

Posted: March 5, 2013 at 10:31 am

I just finished reading a new Hobson paper (“Waking and dreaming consciousness: Neurobiological and functional considerations”), which is an update of Hobson’s AIM model integrating Friston’s “free energy formulation”. The key points are that we can consider waking perception as a learning process where the difference between what happens and what is expected drives increasingly accurate predictions. REM sleep and waking are contiguous processes, and the lack of external stimulus during REM means there are no prediction errors, which triggers dream experiences. In my reading it appears that Hobson proposes that visual images in dreams are the result of the ocular movements themselves (REMs) predicting visual percepts. Hobson proposes that dreams are of functional use because they manifest an optimization process: “one can improve models by removing redundant parameters to optimize prior beliefs. In our context, this corresponds to pruning redundant synaptic connections.” In short, dreaming improves the quality of the predictive model of the world, in the absence of sensory information, by pruning.
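Neither the paper nor my summary pins this down at the level of an algorithm, but the core loop — prediction errors adjusting a model during waking, and redundant parameters being pruned during sleep — can be sketched in toy form. Everything below (the linear predictor, the delta-rule update, the magnitude threshold for pruning) is my own illustrative assumption, not Friston’s free-energy math:

```python
import numpy as np

# Toy illustration (not the free-energy formulation): a linear predictor
# whose weights are nudged by the prediction error during "waking",
# and whose near-zero (redundant) weights are pruned during "sleep".

rng = np.random.default_rng(0)

n_inputs, n_outputs = 8, 4
W = rng.normal(scale=0.1, size=(n_outputs, n_inputs))  # the predictive model

def wake_step(W, stimulus, observation, lr=0.05):
    """Update the model from the mismatch between expectation and input."""
    prediction = W @ stimulus            # what the model expects
    error = observation - prediction     # prediction error ("surprise")
    W += lr * np.outer(error, stimulus)  # error-driven (delta-rule) update
    return W, error

def sleep_step(W, threshold=0.01):
    """Crude stand-in for synaptic pruning: zero out redundant weights."""
    W[np.abs(W) < threshold] = 0.0
    return W

# One "day": learn from a stream of (stimulus, observation) pairs...
for _ in range(100):
    s = rng.normal(size=n_inputs)
    o = np.tanh(s[:n_outputs])           # some structured "world"
    W, err = wake_step(W, s, o)

# ...then "sleep": no external input, only simplification of the model.
W = sleep_step(W)
```

The point is only the division of labour: the waking updates are driven by mismatch with external input, while the “sleep” step simplifies the model with no input at all.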

On this first reading it appears there is an extremely rich overlap between Hobson and Friston’s conception and the proposal of contextual learning described in the previous post. The math was not fully explained in this paper, so it looks like I’ll have to get Friston’s latest review of the “Free-Energy” conception of the brain to make proper sense of it. It does appear to make use of some complex differential relations, which are not likely to improve my comprehension. The proposal is highly Bayesian in orientation, and it’s unclear how that formalism affects the current effort to reconcile the notion proposed in the previous post with an MDP (Markov Decision Process). What we gain with an MDP is a set of very well known and highly functional algorithms that learn complex sequences of actions from negative and positive rewards, even when those rewards are only applied at the end of the chain of actions. An MDP agent takes actions, in the context of a particular state, that aim to maximize reward through the transition to a new state. One possible mapping (provided by Philippe) of the current conception to an MDP is as follows (a rough sketch of this mapping in code appears after the list):

  • State Space: All possible combinations of percepts used to build mental images (i.e., the pattern of activation over all percepts)
  • Action: The selection of which percepts to prime (predict) next.
  • Reward: Inverse of the distance between the primed and perceived patterns of activation.
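To make the moving parts of this mapping concrete, here is a minimal tabular Q-learning sketch: a state is a binary pattern of percept activations, an action primes a single percept, and the reward is the negative Hamming distance between the primed and the subsequently perceived pattern. The toy cyclic “world”, the single-percept actions, and the particular distance measure are all my own placeholder assumptions, not Philippe’s specification:

```python
import random
from collections import defaultdict

random.seed(1)

N_PERCEPTS = 4                        # tiny on purpose: the state space is 2**N
ACTIONS = list(range(N_PERCEPTS))     # action = which single percept to prime

def one_hot(i):
    return tuple(1 if j == i else 0 for j in range(N_PERCEPTS))

def perceive(state):
    """Toy 'world': percepts activate in a fixed cycle (purely illustrative)."""
    current = state.index(1) if 1 in state else -1
    return one_hot((current + 1) % N_PERCEPTS)

def reward(primed, perceived):
    """Negative Hamming distance between primed and perceived activation patterns."""
    return -sum(p != q for p, q in zip(primed, perceived))

# Tabular Q-learning over (state, action) pairs.
Q = defaultdict(float)
alpha, gamma, epsilon = 0.5, 0.9, 0.1

state = one_hot(0)
for step in range(2000):
    # epsilon-greedy choice of which percept to prime next
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])

    primed = one_hot(action)           # the predicted next pattern
    next_state = perceive(state)       # what is actually perceived
    r = reward(primed, next_state)

    # standard Q-learning update
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])
    state = next_state

# After training, the greedy action in each state primes the percept that is
# actually perceived next, i.e. the table has learned to "predict" the cycle.
```

Even in this toy case the table is indexed by full activation patterns, which is exactly where the state-space problem discussed below comes from.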

Under the MDP framework, the size of the state space and the number of possible actions may raise scalability issues. A major theoretical issue is the lack of an obvious pruning process, as the knowledge learned by the system is not manifested in links between nodes. Perhaps looking more deeply into the MDP framework will uncover some variable that could play a role similar to a pruning process, but it seems that an MDP may not be appropriate. Since Free Energy is considered in a Bayesian framework, perhaps it would be worthwhile to look at the proposal in the previous post in relation to Bayesian networks.

In our last meeting, Philippe mentioned that I should implement what is currently in mind for the lower components of the system (segmentation and clustering), as that foundation will help inform the next set of choices for higher-level learning systems. This involves putting the existing code for segmentation into a class that makes use of OpenCV functions, but could be used in either OpenCV or OpenFrameworks (because we are aiming for a large amount of perceptual material, OpenGL rendering of dreams may prove inappropriate). Rendering over 1000 percepts during NFF certainly slowed down rendering. In that case 1000 percepts seemed like a small amount because they were spread over one dream. For a single frame, 1000 percepts would be more than enough, which matches the number of percepts in the print images produced during NFF.
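As a rough sketch of the kind of wrapper class I have in mind (shown here in Python against OpenCV’s bindings purely for brevity; the actual class will be C++ so it can sit behind either plain OpenCV or OpenFrameworks), with the class name, parameters and the particular blur/threshold/contour pipeline all placeholders rather than the existing segmentation code:

```python
import cv2
import numpy as np

class Segmenter:
    """Placeholder wrapper: isolate percept candidates in a frame using
    only OpenCV calls, so the same class can sit behind either a plain
    OpenCV pipeline or an OpenFrameworks front end."""

    def __init__(self, blur_ksize=5, thresh=127, min_area=100):
        self.blur_ksize = blur_ksize
        self.thresh = thresh
        self.min_area = min_area

    def segment(self, frame_bgr):
        """Return a list of (bounding_box, mask) pairs for candidate percepts."""
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (self.blur_ksize, self.blur_ksize), 0)
        _, binary = cv2.threshold(gray, self.thresh, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        percepts = []
        for c in contours:
            if cv2.contourArea(c) < self.min_area:
                continue  # ignore tiny regions
            x, y, w, h = cv2.boundingRect(c)
            mask = np.zeros(gray.shape, dtype=np.uint8)
            cv2.drawContours(mask, [c], -1, 255, thickness=-1)
            percepts.append(((x, y, w, h), mask[y:y + h, x:x + w]))
        return percepts
```

Keeping the class free of any rendering-framework types is what should let the same segmentation code sit behind either rendering path.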