I have not posted in a while because I have been busy preparing for the New Forms Festival. I collected quite a lot of images (~750,000), enough for a few full day cycles, which will be good to test with later. I spent most of the prep time before the exhibition working on loading percepts saved to disk by the segmentation system, and on generating images in openFrameworks. It was not until I was in the exhibition space for the mini-residency that I started working on a prototype of the dreaming system. I skipped the high-level features and just used the low-level features for association (mean L, u, v for colour; area for size; X and Y for position; and frame number for time).
The approach to propagation was quite simple, and I ended up getting further than I expected. Since each feature is one-dimensional, I realized I could use linked lists: each feature dimension has its own linked list of pointers to percepts, sorted by that dimension’s value. When one percept is activated, it passes its activation on to its nearest neighbour, which is either the previous or the next item in the list. Signals decay in proportion to the inverse of their similarity, so activation decays only slightly between very similar percepts (along that specific dimension). This is all informed by the propagation of activation in Dreaming Machine #2, except that the units are not whole images but segmented regions.
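As a rough illustration of that propagation step, here is a minimal C++ sketch. A sorted array stands in for the linked list, and the `Node` struct, `propagate` function, and `falloff` constant are hypothetical names of my own, not taken from the actual code:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One entry in a feature dimension's sorted list: a percept's value
// along that dimension (e.g. mean L, area, X position) plus its
// current activation level.
struct Node {
    double value;
    double activation;
};

// Pass activation from the node at `src` to its nearest neighbour,
// which in a value-sorted list is either the previous or the next
// item. The signal is attenuated by the value difference, so very
// similar percepts lose little activation. `falloff` is a tuning
// constant (an assumption, not from the original system).
void propagate(std::vector<Node>& dim, std::size_t src, double falloff) {
    bool hasPrev = src > 0, hasNext = src + 1 < dim.size();
    if (!hasPrev && !hasNext) return;
    std::size_t dst;
    if (!hasPrev)      dst = src + 1;
    else if (!hasNext) dst = src - 1;
    else  // pick whichever neighbour is closer in value
        dst = (dim[src].value - dim[src - 1].value <=
               dim[src + 1].value - dim[src].value) ? src - 1 : src + 1;
    double dissim = std::fabs(dim[src].value - dim[dst].value);
    // decay grows with dissimilarity: near-identical values pass
    // activation through almost unchanged
    dim[dst].activation += dim[src].activation / (1.0 + falloff * dissim);
}
```

With, say, three percepts at values 0.0, 0.1, and 0.5, activating the middle one passes most of its activation to the 0.0 neighbour, since it is closer along that dimension.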
The segmentation algorithm was working well once I realized the filtering code was broken because I had assumed there would be only one match for each criterion for keeping a percept. After a few days of processing, I saw the opposite problem: the system had too few percepts in memory, or rather a too-constrained set of percepts without much range in features. (Part of this was due to the captured frames reaching night-time.) I realized that exponential growth was only a problem if all remembered percepts were compared to all new percepts. It occurred to me (the day before the opening) that I needed a long-term memory: a repository of percepts that are less relevant to the current context and therefore need not be compared to new percepts. I started hacking together some quick code to move older percepts into another container (a better idea is a LIDA-inspired system where short-term memory percepts decay over time, but some get moved into long-term memory) when I noticed segfaults in the merging code. I ended up removing not only the LTM code but also all of the merging code. The percepts used in the later images and the dream animation ended up being single instances over time, with extreme filtering to keep exponential growth at bay. Due to the lack of merging the regions had very hard edges, which was not really a problem since the merging was not working very well at all (likely due to using Pearson correlation between Luv histograms as a similarity measure).
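For reference, the similarity measure blamed above is just a Pearson correlation computed over two histograms. A minimal C++ sketch, assuming histograms stored as plain vectors (the function name and representation are my own; the actual code operates on Luv histograms from the segmentation system):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Pearson correlation between two equal-length histograms.
// Returns a value in [-1, 1]: 1 for perfectly correlated bins,
// -1 for perfectly anti-correlated ones.
double pearson(const std::vector<double>& a, const std::vector<double>& b) {
    std::size_t n = a.size();
    double ma = std::accumulate(a.begin(), a.end(), 0.0) / n;
    double mb = std::accumulate(b.begin(), b.end(), 0.0) / n;
    double num = 0.0, da = 0.0, db = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        num += (a[i] - ma) * (b[i] - mb);  // covariance term
        da  += (a[i] - ma) * (a[i] - ma);  // variance of a
        db  += (b[i] - mb) * (b[i] - mb);  // variance of b
    }
    return num / std::sqrt(da * db);
}
```

One plausible weakness as a merge criterion is visible in the formula itself: the correlation is invariant to overall scale and offset, so two regions with quite different colour distributions can still correlate highly if their histogram shapes merely co-vary.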
The exhibition was an artist-as-artwork style installation where I was present during gallery hours to discuss the work with the audience and continue developing it. I had a few interesting interactions with the audience. The first surprise was how much people responded to the print images: one was even bought, and I had multiple enquiries. The down-sampling of the image before generating segmentation masks caused the edges of regions to look quite hard, almost pixelated. (I’ll include images in my next post.) People really seemed to respond to this aesthetic aspect, often citing how “painterly” the images looked. Additionally, a few of the images included poor merges of foreground objects that were obviously perceptually incorrect and yet resulted in quite interesting images, which multiple viewers compared to watercolours. The images are quite aesthetically interesting and rich, despite being caused by errors and other unforeseen results (surprise is, after all, one of the goals of the project). I’m left with a tension between what the system is “supposed” to do correctly and the production of visually compelling images: a tension between concept and image/form. Of course there is no right answer (a camera does not resemble an eye, and therefore the images “seen” by the system are not like the images seen by a living being). The aesthetic exploration at this level is an examination of the aesthetics of computer vision algorithms more than a theory of dreaming, although dreaming cannot be separated from perception. The quality of the edges is a function of the morphology and blurring operations used to smooth regions out enough for floodfill to be effective (rather than creating thousands of tiny regions). The dreams of a machine are a function of its “experience” in the world, and that includes any perceptual (emotional, narrative, etc.) deficits.
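To illustrate why the smoothing matters for floodfill, here is a toy 4-connected region-labelling sketch in C++ (not the actual openFrameworks/OpenCV pipeline): on noisy, unblurred pixel data it can return a separate region for nearly every pixel, which is exactly what the blurring and morphology operations are there to prevent.

```cpp
#include <cassert>
#include <vector>

// Count 4-connected regions of equal pixel value in a w x h image
// stored row-major in `img`. A stack-based flood fill labels each
// contiguous region; unsmoothed noise fragments the image into many
// tiny regions, while blurring merges them into a few large ones.
int countRegions(const std::vector<int>& img, int w, int h) {
    std::vector<int> label(img.size(), -1);
    int regions = 0;
    for (int start = 0; start < w * h; ++start) {
        if (label[start] != -1) continue;  // already filled
        ++regions;
        std::vector<int> stack{start};
        label[start] = regions;
        while (!stack.empty()) {
            int p = stack.back(); stack.pop_back();
            int x = p % w, y = p / w;
            const int  nbr[4] = {p - 1, p + 1, p - w, p + w};
            const bool ok[4]  = {x > 0, x + 1 < w, y > 0, y + 1 < h};
            for (int i = 0; i < 4; ++i)
                if (ok[i] && label[nbr[i]] == -1 && img[nbr[i]] == img[p]) {
                    label[nbr[i]] = regions;
                    stack.push_back(nbr[i]);
                }
        }
    }
    return regions;
}
```

A clean 2x2 image split into two halves yields two regions, while a 2x2 checkerboard (maximal noise) already yields four, one per pixel.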
At this stage these images stand as a point in a process, and as I attempt to make the system “work”, hopefully some of the aesthetic aspects of these images will remain. During the last few days of the exhibition I changed the background segmentation code to use colour regions (rather than luminosity) and also added greater blurring to soften the hard edges caused by an even smaller mask image (necessitated by the extra processing time of colour floodfill).
There was much discussion of the pictorial aspect of the images: that they appeared to represent a specific place in time, with constraints on the content of the images. This is largely due to the fixed camera position. The choice of what the camera sees obviously has a huge impact on the aesthetic of the images. The camera height, which in this case was about eye level for a ten-year-old, is also significant, and I look forward to exploring height variations. Part of this pictorial discussion was also around the arrangement of objects in the images. In nearly all the images, the regions were presented in the same position as they were seen in the camera frame. In some cases I simply copied the “reconstruction” image generated by the segmentation program, while in other cases I controlled how many percepts should be in one image and how they should be stacked, without changing their features (position, size, colour). The development of the dream animation led to highly minimal results with much white space. While the print images contained a thousand or so percepts, even three times as many led to a minimal density in the animation. I ended up using background percepts that stayed static behind the foreground percepts, whose opacity was the result of dream activation. The final animation contained about six thousand foreground percepts and one thousand background percepts; this was as many as I could render without the frame-rate dropping below 30fps. I’ll post the images and a video of dream propagation in the next post.