Since this project is meant for the complex, ever-changing real world, I’ve moved from the simple toy examples (balls and coasters) to outdoor scenes. Unfortunately, things are not working very well. The background subtraction method is not suitable for a real-world context in which the background has little structure. Interactions between foreground and background mean that the contours found by OpenCV bear little resemblance to human object segmentation. Let’s take an image of a car, in context, as an example:
Although the background subtraction does a very good job of actually segmenting the car, the complexity of the car leads to OpenCV finding very poor contours, none of which corresponds to the whole car:
It turns out the problem was the default value for the maximum size for contour extraction. Changing this value has led to much better contour extraction, as pictured here:
Still, even with these changes, the diversity of data in the real world (at least as encapsulated in a few hundred test images captured in one day) is too great to be well clustered. Getting very good accumulations in the real world would mean focusing on specific objects that have a fixed orientation and do not change shape. For example, many images of the same model of car in the same colour would accumulate very well, but collecting enough images to form an accumulation could only happen in a long-term installation. In short, there is nearly nothing that is moving, has a static shape and orientation, and appears in multiple images captured during one day. To illustrate, here are the best and worst accumulations:
I can think of the following possible solutions:
- Collect more images, ensuring that the same car model is photographed multiple times, and tune the system to deal with cars in particular.
- Change from the background subtraction to a different method of segmentation. In the long term this is the way to go, as it would allow static objects within the context to be perceived as objects.
- Rather than cropping, use some model-fitting method that would be independent of both orientation (2D) and position. This may lead to the accumulation of elements of different objects that are considered the same with regard to histogram and edge detection.
One idea that may help with clustering would be to get the histogram of only the area of the contour, and not the whole cropped image.