As I’m reading about mental imagery and mental representation, I’m torn between the Constructivist and Empiricist positions on development, as described by Muller, Sokol and Overton. On one hand, there is the idea that mental representation is consciously constructed in relation to the world in an embodied fashion. The cornerstone of this position is that representation exists to serve the intentionally directed will of the agent. On the other hand, there is the notion that the relationship between mental representation and the world is a causal one: the world impresses itself on the agent, who passively receives signals that are transformed into representations.
Following is the final presentation and paper created for IAT888 (metacreation) for the perception and synthesis system of DM3. Here is a teaser of one of the automated accumulations:
It appears that all the SOM weights fed by edge-detection are very similar:
Corresponding to these resulting accumulations:
I think the poor accumulations are due to poor distance measures over the edge-detections. The next step is to break the objects into a few Euclidean blocks and calculate histograms from those blocks. Block histograms should be much more flexible regarding object edges than the edge-detection. Basically, it appears that two edge-detections of the same object in different orientations are as different as edge-detections of two different objects.
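To make the block idea concrete, here is a minimal sketch of per-block histogram features in Python using numpy. The grid size and bin count are assumptions for illustration, not the values used in the patch:

```python
import numpy as np

def block_histograms(img, grid=(2, 2), bins=8):
    """Split an RGB image into a grid of blocks and concatenate
    normalized per-block, per-channel colour histograms.
    Block-local histograms tolerate orientation changes better
    than a global edge map. (Illustrative sketch; grid and bins
    are assumed values.)"""
    h, w, _ = img.shape
    feats = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            block = img[i * h // grid[0]:(i + 1) * h // grid[0],
                        j * w // grid[1]:(j + 1) * w // grid[1]]
            for c in range(3):  # one histogram per colour channel
                hist, _ = np.histogram(block[..., c], bins=bins,
                                       range=(0, 256))
                feats.append(hist / max(hist.sum(), 1))  # normalize
    return np.concatenate(feats)
```

A 2×2 grid with 8 bins per channel yields a 96-value feature vector per image, which could be fed to the SOM in place of (or alongside) the edge-detection.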
For this project, the major value of the functional distinction between the ventral and dorsal streams of the visual system, as proposed by Mishkin, Ungerleider, and Macko (1983), is:
- The ventral stream is specialized for object recognition.
- The dorsal stream is specialized for object location and context.
- Brodmann area 40 (BA40), which is correlated with dreams, is in an interesting location between dorsal and ventral streams.
The idea was that BA40 may integrate information from both ventral and dorsal streams, allowing dreams to combine contexts with objects in surprising ways. Unfortunately, many studies refute the claim that the dorsal visual stream is correlated with object contexts and locations, arguing instead that it deals with sensorimotor control in visually guided behaviour (Goodale and Westwood, 2004).
There is evidence that a functional dissociation occurs in the processing of scenes/contexts and objects, but within the ventral stream itself. The hippocampus is associated with the processing of contexts and locations, while the perirhinal cortex is associated with object recognition (Murray, Bussey & Saksida, 2007). The project will continue with two subsystems: one for objects, and the other for contexts and locations. These ideas have been integrated into the working document: dreaming-machine-perception-synthesis.html.
Since this project is meant for the complex, ever-changing real world, I’ve moved from the simple toy examples (balls and coasters) to outdoor scenes. Unfortunately, things are not working very well.
In order to make a greater contribution and situate this work in what is known about the visual system, I am adopting the separation of the two major streams leading from the primary visual cortex. The dorsal stream (occipital to parietal lobe) is considered, by one theory, the “where” region of the visual system, associated with locations and places. The ventral stream (occipital to temporal lobe) is the “what” region, associated with particular classes of objects.
The idea is to separate the visual analysis into these two streams. Some rough ideas regarding how this could work are in this document. It is a subset of the “Dreaming Machine #3 Notes” document posted previously and contains some additional ideas for the final paper, in particular a first attempt to map the system to biological processes.
I put a quick SOM into the current test patch to see how well it deals with this idealized object data. The SOM is a 2×2 map, trained using a constant learning rate of 0.5 and a constant neighbourhood size of 1. I did not keep track of the number of training iterations. As proposed, the images were abstracted into an RGB histogram (768 values) and a 40×30 pixel edge-detection before being fed into the SOM.
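For reference, a SOM with these constant parameters (2×2 map, learning rate 0.5, neighbourhood radius 1) could be sketched as follows. The actual patch runs in PD, so this numpy version is an illustration only, with an assumed iteration count:

```python
import numpy as np

def train_som(data, shape=(2, 2), lr=0.5, radius=1, iters=100, seed=0):
    """Minimal SOM with constant learning rate and neighbourhood
    radius, as described above. data: (n_samples, dim) array.
    (Illustrative sketch; the iteration count is an assumption.)"""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    weights = rng.random((shape[0], shape[1], dim))
    # grid coordinates of each unit, for neighbourhood distances
    coords = np.dstack(np.meshgrid(range(shape[0]), range(shape[1]),
                                   indexing='ij'))
    for _ in range(iters):
        x = data[rng.integers(len(data))]
        # best-matching unit: smallest Euclidean distance to input
        d = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(d), shape)
        # update the BMU and all units within the neighbourhood radius
        grid_d = np.linalg.norm(coords - np.array(bmu), axis=2)
        mask = grid_d <= radius
        weights[mask] += lr * (x - weights[mask])
    return weights
```

Note that with a 2×2 map and radius 1, each update moves the BMU and its two grid neighbours, so the units stay fairly correlated; that may be part of why the edge-detection weights all look so similar.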
The following images illustrate how the SOM would accumulate the images. These were layered manually, not accumulated in PD, based on how the SOM would choose to group them:
Here is the simulated accumulation of the same coaster images discussed in this previous post. This time they are cropped by the single largest contour found. Note how much less emphasis the poorly registered image has.
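A crop to the single largest region can be sketched as follows. This stand-in uses scipy’s connected-component labelling rather than a contour finder, so it approximates rather than reproduces the actual method:

```python
import numpy as np
from scipy import ndimage

def crop_largest_component(mask, image):
    """Crop an image to the bounding box of the largest connected
    foreground region in a binary mask -- a stand-in for the
    'single largest contour' crop described above. (Sketch only;
    the real patch presumably uses a contour finder.)"""
    labels, n = ndimage.label(mask)
    if n == 0:
        return image  # no foreground: return the image unchanged
    # size of each labelled region (labels run from 1 to n)
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    largest = int(np.argmax(sizes)) + 1
    rows, cols = np.nonzero(labels == largest)
    return image[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
```

Cropping this way discards the smaller spurious regions, which is why a poorly registered image contributes so little to the accumulation.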
Here is the three-page proposal I’ve written for the meta-creation class. It describes the system in more detail and discusses it as a whole. Philippe’s feedback was that this perception component does not make enough of a contribution to the field. He suggested that I either choose a more interesting method of object segmentation, perhaps a more biologically oriented one, or use a different type of SOM, such as a GSOM. A GSOM is a SOM that increases the number of units depending on the quantization error (QE) calculated for each unit. A high QE means too many inputs are associated with a particular unit, and that the map is therefore likely too small.
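The QE-driven growth criterion can be sketched as follows. The threshold value here is an assumption, and a real GSOM also specifies where on the map new units are inserted, which this sketch omits:

```python
import numpy as np

def quantization_errors(data, weights, assignments):
    """Per-unit quantization error: mean distance between a unit's
    weight vector and the inputs assigned to it. In a GSOM, units
    whose QE exceeds a growth threshold trigger the insertion of
    new units. (Illustrative sketch.)"""
    qe = np.zeros(len(weights))
    for u in range(len(weights)):
        members = data[assignments == u]
        if len(members):
            qe[u] = np.linalg.norm(members - weights[u], axis=1).mean()
    return qe

def units_to_grow(qe, threshold=0.5):
    """Indices of units whose QE warrants growth (threshold assumed)."""
    return np.nonzero(qe > threshold)[0]
```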
Next steps are to look into other object segmentation methods, to continue implementing the currently proposed system in gridflow, and to look into implementing a SOM, and then a GSOM, in gridflow.
The perception/synthesis project is a component of the next “Dreaming Machine” installation. In DM1 and DM2, entire images were stored, not components of images. The purpose of this project is to determine a method for extracting components from an image without a complex shape recognition system.