Clustering Test 8 (45,000 frames, 2000BG, 1000FG)

Posted: April 30, 2013 at 8:46 am

This is the first long-term test, covering material captured over approximately 12 hours, as shown in the following plot. In this case there are 2000 background clusters and 1000 foreground clusters. The mean rendering time is 2.6 seconds with a variance of 0.28.

45000_debug_2000bgc_1000fgc

The next plot shows the histograms for the mean values of the masks (confidence) and number of merges for each percept:

45000_summary_2000bgc_1000fgc

Note that the number of FG merges does not really change compared to previous runs with fewer frames, presumably because there is little repetition (at least in terms of machine-readable features that are fast to compute). Following are composites, each showing 100 percepts stacked on top of one another:

Foreground

outputFG-0 outputFG-1 outputFG-2 outputFG-3 outputFG-4 outputFG-5 outputFG-6 outputFG-7 outputFG-8 outputFG-9

Clearly the clusters attempt to incorporate highly diverse material. While the mean colours of two foreground regions may be within the threshold, the regions may differ greatly in image content. Additionally, the range of aspect ratios means that images are disproportionately scaled to fit, exaggerating the discontinuity. This could be solved by constraining merges to images with similar aspect ratios. With these images in mind, consider the small number of merges done by the system. One thought is that the similarity threshold is too wide, leading to dissimilar images being clustered. As these new clusters are the basis of future clustering (once the maximum number of clusters has been reached), this may lead to muddy clusters that don’t differentiate sufficiently from each other. For the next test I have made the distance threshold stricter, and added constraints on the range of aspect ratios and areas of merged percepts.
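The constrained merge test described above could look roughly like the following. This is only a minimal sketch: the `Percept` class, the `can_merge` function, and all three threshold values are hypothetical placeholders, not the values or structures used in the actual system, and mean colour stands in for whatever feature distance the clustering really uses.

```python
import math
from dataclasses import dataclass


@dataclass
class Percept:
    """Hypothetical stand-in for a segmented region."""
    mean_colour: tuple  # (r, g, b), each 0..255
    width: int
    height: int

    @property
    def aspect(self):
        return self.width / self.height

    @property
    def area(self):
        return self.width * self.height


def can_merge(a, b, max_colour_dist=20.0, max_aspect_ratio=1.5, max_area_ratio=2.0):
    """Allow a merge only if colour, shape and size are all similar.

    The three thresholds are illustrative assumptions, not tuned values.
    """
    colour_dist = math.dist(a.mean_colour, b.mean_colour)
    # Ratios are taken as larger / smaller so the test is symmetric in a and b.
    aspect_ratio = max(a.aspect, b.aspect) / min(a.aspect, b.aspect)
    area_ratio = max(a.area, b.area) / min(a.area, b.area)
    return (colour_dist <= max_colour_dist
            and aspect_ratio <= max_aspect_ratio
            and area_ratio <= max_area_ratio)


wide = Percept((100, 100, 100), 200, 100)
similar = Percept((105, 100, 100), 210, 100)
tall = Percept((105, 100, 100), 100, 200)

print(can_merge(wide, similar))  # similar colour, shape and size
print(can_merge(wide, tall))     # same colour and area, very different aspect
```

Here the second pair has nearly identical mean colour and identical area, but is rejected on aspect ratio alone, which is the case that produces the stretched composites above.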

Background

outputBG-0 outputBG-1 outputBG-2 outputBG-3 outputBG-4 outputBG-5 outputBG-6 outputBG-7 outputBG-8 outputBG-9 outputBG-10 outputBG-11 outputBG-12 outputBG-13 outputBG-14 outputBG-15 outputBG-16 outputBG-17 outputBG-18 outputBG-19

The most obvious attribute of background percepts is that they still tend to have hard rectangular edges, which occur when masks exceed the boundaries of the image. At first I thought this could be resolved simply by increasing the size of the images, providing some padding space around segmented regions, but this would just increase the size of the visible boxes, because these hard edges are the result of merging percepts with highly varying masks. It seems the best approach would be to subtract a soft vignette from each mask so that the edges are always soft, even when the mask would otherwise have led to a hard edge. The difficulty with this approach is that we would like a vignette that is not highly obvious. A soft Gaussian mask is tempting, but would likely only be successful for nearly square (aspect ≈ 1) percepts.
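One possible way to subtract a vignette without assuming a square percept is to base the falloff on the distance to the nearest image border rather than on a Gaussian centred on the image. The sketch below is an assumption, not the method used in the system: `soften_mask`, `smoothstep`, and the `margin` parameter are hypothetical names, and masks are represented as plain lists of rows with values in 0..1.

```python
def smoothstep(t):
    """Smooth ramp from 0 to 1, clamped outside the 0..1 range."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def soften_mask(mask, margin):
    """Subtract a border vignette from a mask (list of rows, values 0..1).

    The vignette is 1 on the image border and falls smoothly to 0 at
    `margin` pixels inside it, so the result is always 0 at the border
    regardless of the mask's aspect ratio.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    h = len(mask)
    w = len(mask[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Distance in pixels to the nearest border (0 on the border itself).
            d = min(x, y, w - 1 - x, h - 1 - y)
            vignette = 1.0 - smoothstep(d / margin)
            row.append(max(0.0, mask[y][x] - vignette))
        out.append(row)
    return out


# A mask that fills the whole 12x8 image, i.e. the hard-edged worst case.
full = [[1.0] * 12 for _ in range(8)]
soft = soften_mask(full, margin=2)
print(soft[0][0], soft[1][5], soft[4][6])
```

With `margin=2`, the border pixels drop to zero, pixels one step inside are halved, and the interior is untouched. Whether such a ramp is unobtrusive enough in the composites would have to be judged visually.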

A mock-up of the vignette would be quite straightforward to try. The second thing to note is the darkness of these images. Percepts appear quite dark because the most recently processed frames were captured at sunset. As new percepts are always added to the end of the list, percepts toward the back of the list tend to be more recent. The above images are ordered by their position in the list, so the first image is dominated by percepts collected (and merged) earlier in time. Currently, merges are done such that existing content makes up only 50% of the updated merge, which greatly emphasizes the most recently added percept. This should be more like 66% or even 75% existing content, leaving the newly merged percept to contribute only 33% or 25% of the cluster.
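The effect of the merge weight can be illustrated with a simple weighted average. This is a sketch under assumptions: `merge_pixels` and `existing_weight` are hypothetical names, and a percept is reduced to a flat list of pixel intensities rather than a masked image.

```python
def merge_pixels(existing, incoming, existing_weight=0.75):
    """Blend a new percept into a cluster as a weighted average.

    existing_weight=0.5 corresponds to the current behaviour;
    0.66 or 0.75 keeps more of the existing cluster content.
    """
    if not 0.0 <= existing_weight <= 1.0:
        raise ValueError("existing_weight must be in 0..1")
    if len(existing) != len(incoming):
        raise ValueError("percepts must have the same number of pixels")
    new_weight = 1.0 - existing_weight
    return [existing_weight * e + new_weight * n
            for e, n in zip(existing, incoming)]


bright_cluster = [100.0, 200.0]
dark_sunset = [0.0, 0.0]

print(merge_pixels(bright_cluster, dark_sunset, existing_weight=0.5))
print(merge_pixels(bright_cluster, dark_sunset, existing_weight=0.75))
```

Because each merge multiplies the older content by the existing weight, the difference compounds: after three merges the original content retains 0.5³ = 12.5% of its weight at the current setting, but 0.75³ ≈ 42% at the proposed one.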

The following images show all the background (top) and foreground (bottom) percepts:

percepMontageFG percepMontageBG