I’ve been experimenting with using floodFill directly on the morphology output, effectively doing the same thing as my early mean-shift segmentation approach while bypassing the mean-shift stage (which was highly computationally intensive). The results are promising:
Following from my previous post on the temporal instability of mean-shift segmentation, I’ve been looking at the feasibility of using an edge detector to do segmentation. A simple test with Canny() showed that the edges are very stable over time. So I went ahead with standard OpenCV methods, extracting contours with findContours() and approximating them into polygons. At this point I realized a familiar problem had returned: the vast majority of segmented regions are not regions at all, but the empty space between regions. Following is an image of the Canny output (with some morphology operations to reduce noise):
Following is a diagram of the current conception of the system. It is a high-level overview in which many details are omitted. A number of modules, filled in blue, have been added since the last diagram. After looking more at LIDA, it has become clear that it will not be that useful for this system. That being said, there are overlaps between these modules and some of the LIDA modules.
So I had a chance to look at the data I dumped on Monday showing the features of patches in relation to frame numbers. The unfortunate realization is that the centre position of a patch is not a good indicator that patches should be merged, because the segmentation is very unstable over time. My tolerance for merging patches is currently 3 pixels, but after looking at the data, some patches are as much as 70 pixels off from frame to frame because the edges are so unstable. I hope this instability is caused by the mean-shift segmentation, which also imposes a huge computational load.
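For concreteness, the merge test described above amounts to a simple distance check on centre positions; the patch representation and function name here are illustrative.

```python
import math

MERGE_TOLERANCE = 3.0  # pixels, as in the post

def should_merge(centre_a, centre_b, tol=MERGE_TOLERANCE):
    """True if two patch centres are close enough to be the same patch."""
    dx = centre_a[0] - centre_b[0]
    dy = centre_a[1] - centre_b[1]
    return math.hypot(dx, dy) <= tol

# A stable patch passes; the ~70 px frame-to-frame jumps seen in the data do not:
# should_merge((100, 100), (101, 102))  -> True  (distance ~2.24 px)
# should_merge((100, 100), (170, 100))  -> False (distance 70 px)
```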
My next steps are to see whether another method may be more stable. I asked on the OpenCV mailing list, and someone mentioned that task-independent segmentation is inherently problematic, since segmentation in the general case is an open problem. It has been argued that humans can only do it so well because of top-down control processes influencing perception. Since the segmentation does not have to be perfect, I’ll look at other, less perceptual, segmentation methods that may be more stable. One promising method is using an edge detector to bound a floodfill operation. This should be more stable over time, but the regions may be strangely shaped. Better than nothing.
I had a dream last night. I’m writing about it here because it is relevant to the project, in that it was an unusual dream for me. Also, after spending so much time reading about dreaming, I found a number of its features quite interesting. I’ll start with the dream itself…
I’m stuck on a couple of problems, and wanted to post about them before continuing the work. These problems are highly interrelated and highly relevant to the link between theory and implementation. All of them are rooted in a single core problem: the overstimulation of the system in relation to memory, rather than activation.
During a panel discussion on copyright at ISEA Istanbul (2011), it occurred to me that beyond the corporate and monetary aspects of copyright lies the notion of attribution: the acknowledgement that someone else has contributed to a work. This is highly relevant to my previous post on ownership.
The idea was simply that it would be interesting for each cultural artifact (sound, video, image, etc.) to have a list of contributors. When those items are remixed and recontextualized, the resulting construction would concatenate all the contributors from each component. The result would be a growing history of all those who contributed to a work. One could even imagine that each contribution could be weighted to a degree, perhaps tied to the difference between the “original” (previous incarnation) and the remixed permutation. There could even be a section for “inspiration”, where indirect attribution could be made.
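A toy sketch of this attribution scheme: each artifact carries a list of weighted contributors, and remixing concatenates the histories of its components before appending the remixer. The Artifact class, field names, and weighting are hypothetical illustrations, not an existing system.

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    title: str
    contributions: list = field(default_factory=list)  # (name, weight) pairs
    inspirations: list = field(default_factory=list)   # indirect attribution

def remix(title, author, components, weight=1.0):
    """Combine artifacts, inheriting the full contribution history of each."""
    result = Artifact(title)
    for c in components:
        result.contributions.extend(c.contributions)
        result.inspirations.extend(c.inspirations)
    # The remixer's own (possibly partial) contribution is appended last.
    result.contributions.append((author, weight))
    return result
```

For example, remixing a field recording by alice with found footage by bob into a collage by carol yields the concatenated history [("alice", 1.0), ("bob", 1.0), ("carol", 0.5)].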
Such a system would be extremely interesting in the context of the analysis and visualization of cultural artifacts, as such lists of attribution over time have much potential to illuminate how ideas and forms propagate through culture.
After spending a week trying to debug a memory corruption error, I found the problem: I was attempting to write outside the bounds of the reconstruction image, because a number of percepts were the size of the entire frame. Once I put in a condition to ignore these large percepts (segmentation failures), I got reconstructions that look like this:
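The guard condition amounts to rejecting percepts whose area approaches the whole frame before compositing them; the 90% threshold and function name here are illustrative, not the values used in the project.

```python
def is_segmentation_failure(patch_w, patch_h, frame_w, frame_h, max_frac=0.9):
    """Flag percepts so large they likely reflect a failed segmentation."""
    return patch_w * patch_h >= max_frac * frame_w * frame_h

# A frame-sized percept is rejected; a normal-sized patch is kept.
# is_segmentation_failure(640, 480, 640, 480)  -> True
# is_segmentation_failure(50, 40, 640, 480)    -> False
```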
One of the major theories of consciousness to come out of cognitive science (circa 1988) is Baars’ “Global Workspace Theory”, which can be explained using the metaphor of the mind as a stage. There are many actors, only some of whom are on the stage, and many audience members who watch the performance of the actors on stage. Consciousness is like a spotlight that shines on the stage, allowing the actors to be seen. These actors are unconscious cognitive processors that do much of their work automatically; it is only when they are shone upon by the spotlight that they become available to consciousness. In short, consciousness is an attention mechanism. The spotlight has a very restricted causal role in the system: it can be impacted by unconscious processes, but it is unclear whether it can impact those processes in any way.
One point of discussion with the neurophilosophers was that it is problematic that the viewer looking at the “Dreaming Machine” serves as a homunculus in relation to the unconscious cognitive processes (perception, memory, dreaming) of the artwork. This is because the notion of a homunculus is itself problematic, as it raises the question: if the homunculus is the consciousness of the greater system, where is the consciousness of the homunculus? This leads to an infinite regress. That argument, however, assumes a causally linked homunculus, one that both impacts those processes and is impacted by them. Is the homunculus still problematic if it cannot causally impact the unconscious processes?
Short of abusing the hardware, the viewer has no direct causal impact on the system, although they can passively affect its memory by being present in its visual field. Since the viewer-as-homunculus has no impact on the unconscious processes, need it be conscious? This seems to solve the homunculus problem, because consciousness is not required. The regress is only dissolved, however, if we consider the person as not having a homunculus either, which means that a particular cognitive conception is required: consciousness is an illusion. At this point the two systems become reflections of one another, two mechanistic collections of unconscious processes. Part of the purpose of this work is to engage in discussion of the implications of these radical cognitive conceptions of mind.
The possibility has been raised that consciousness, although an illusion, may still be required for survival: something like the virtual world of “The Matrix”, which is required for the survival of the body.