Why DeepDream has nothing to do with Dreaming

Posted: June 19, 2015 at 11:35 am

A few people have emailed me to ask what I think about this, the method used to generate the image above, so I thought I would post a synthesis of my thoughts here. I was asked because my Ph.D. was about the mechanisms of dreaming. So what does this project have to do with dreaming? Well, it turns out, not much at all. There are a few reasons why dreaming got folded into the discourse:

(a) The (or one of the) network(s) used is “codenamed Inception, which derives its name from the network in network paper by Lin et al [12] in conjunction with the famous ‘we need to go deeper’ internet meme”. Reference 12 contains neither the word “inception” nor the word “dream”. The meme refers to the film “Inception”, which is about sharing dreams. The intended link seems to be the layers of depth in the “Network in Network”, which relate to the film’s depiction of the descent into dreams within dreams.

(b) The research blog post (below) contains the word dream once (twice if we count the occurrence in the comments). In the text, ‘dream’ (their quotes) is used in the context of feeding the ‘inversion’ algorithm a random field ‘prior’. The implication is that dreams are the making sense of random activations, a quite dated conception of dreaming (aligned with the early activation-synthesis hypothesis, Hobson & McCarley) which has since been revised within a predictive context (the free energy minimization formalization of the Activation, Input-Output Gating, Modulation model, Hobson & Friston).

These two minor points (the ‘codename’ of the network architecture, and the caption of the image shown above) have snowballed, and the project has been unfairly dropped into the realm of dreaming. It is amazing how much cultural clout there is to the notion of dreaming as random activations in REM, which has certainly invaded technical realms. Just the fact that these are still images implies they are not dreams, but images. That the networks are trained on highly constrained data is what allows the images to look resolved. I wonder what would happen if they were trained on arbitrary continuous data (e.g. a live camera). Probably something a lot less interesting.

So if the project is not about dreaming, what is it about? Imagery.

Before this came up on my radar (yesterday), I was used to these types of images generated from trained deep learning networks, which are quite underwhelming by comparison, to say the least! The difference is related to a method of ‘inverting’ the abstraction process of a deep learning algorithm. Normally a deep network takes a bunch of images and generalizes a set of high-level abstractions (the level of abstraction corresponding to the layer in the network) for the set of input images. As I understand it, the images I had been used to are the raw representations, not inversions of the network. When they invert the network, they flip the relation around and construct new images from the high-level representations built by an already trained network. In order to do this they provide a ‘prior’ image that is used to ‘regularize’ the task of reconstructing images from high-level abstractions.
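To make the idea of inversion a little more concrete, here is a minimal sketch of my reading of it, and not the code used in the project. It assumes a toy one-layer ‘network’ with fixed random weights standing in for a trained deep network, and it treats the ‘prior’ both as the starting image and as the target of a simple L2 penalty. The names `features`, `invert`, `feature_loss`, `reg` and `lr` are all hypothetical.

```python
import numpy as np

def features(image, weights):
    """Toy one-layer 'network': a fixed linear map followed by tanh."""
    return np.tanh(weights @ image)

def feature_loss(image, target, weights):
    """Half the squared distance between the image's features and the target."""
    return 0.5 * float(np.sum((features(image, weights) - target) ** 2))

def invert(target, weights, prior, steps=500, lr=0.05, reg=0.01):
    """Adjust an image by gradient descent until its features approach `target`.

    Assumption: the prior is the starting point, and an L2 penalty (`reg`)
    keeps the result close to it. This is one simple form of regularization.
    """
    image = prior.copy()
    for _ in range(steps):
        act = np.tanh(weights @ image)
        err = act - target
        # Gradient of 0.5*||tanh(Wx) - t||^2 + 0.5*reg*||x - prior||^2
        grad = weights.T @ (err * (1.0 - act ** 2)) + reg * (image - prior)
        image -= lr * grad
    return image

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.2, size=(16, 64))  # stand-in for a trained network
source = rng.uniform(-1, 1, size=64)            # image the target features came from
target = features(source, weights)              # the 'high-level representation'
prior = rng.normal(scale=0.1, size=64)          # random-field prior
result = invert(target, weights, prior)

print("feature loss at the prior:", feature_loss(prior, target, weights))
print("feature loss after inversion:", feature_loss(result, target, weights))
```

Because there are far fewer features than pixels, many different images produce the same features. The prior decides which of them the process settles on, which is where the tension between the prior and the training data comes from.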

So why are these images so compelling? There are a lot of reasons; the super-saturated colours and clearly recursive nature are the main contributors. I also think there is something quite compelling in the requirement of a prior image in the inversion process, which potentially sets up a significant tension between the prior and the data-set on which the network is trained. When the prior is noise, the iterative process is exposed more clearly, leading to fractal and M.C. Escher-like imagery. The labelling of these images as ‘dreams’ reinforces the notion of dreams as predominantly bizarre, which is a problematic stance. (See my TEDx talk.)

Some studies of lucid dreamers and of those with seizures that affect the temporal lobe show that images are not as distinct as those in external perception. This corresponds to the approach I have taken in my own work, where I focus on dreaming as temporal simulation of learned predictions. This emphasis led me to a simpler use of clustering of colour (rather than biologically inspired features) to generate fuzzy and indistinct ‘percepts’, as exemplified in the test image following this post.
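As a rough illustration of what clustering of colour can look like, here is a small sketch that uses plain k-means on pixel colours. This is a stand-in technique chosen for brevity, not the actual system behind my images. The function name `cluster_colours`, the value of `k` and the random test image are all assumptions made for the example.

```python
import numpy as np

def cluster_colours(pixels, k=4, iterations=10, seed=0):
    """Plain k-means on an (n, 3) array of colours; returns centres and labels."""
    rng = np.random.default_rng(seed)
    # Start from k distinct pixels picked at random.
    centres = pixels[rng.choice(len(pixels), size=k, replace=False)].astype(float)
    for _ in range(iterations):
        # Distance from every pixel to every centre, shape (n, k).
        dists = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # leave an empty cluster's centre where it is
                centres[j] = members.mean(axis=0)
    return centres, labels

rng = np.random.default_rng(1)
image = rng.uniform(0, 1, size=(32, 32, 3))  # hypothetical stand-in for a video frame
pixels = image.reshape(-1, 3)
centres, labels = cluster_colours(pixels, k=4)

# Replace every pixel with its cluster's mean colour: a flat, indistinct 'percept'.
percept = centres[labels].reshape(image.shape)
```

With only a handful of clusters the result keeps the broad colour regions and throws away fine detail, which is the fuzzy quality I am after.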

No doubt, the computer vision / imagery aspect Google is working on is much more advanced than mine, and I can imagine that putting the two projects together could lead to some very interesting results. I wonder what hardware and data-set size are required for these results…

While I wait for word on funding, I’ve started working on an offline version of Watching and Dreaming, which learns from popular cinematic depictions of AI, and I’ll start with Ridley Scott’s Blade Runner. I hope to be posting some new images in the next few weeks.