New clip and randomizing order of samples!

Posted: February 7, 2016 at 1:02 pm

As the clip I first chose ended up (a) looking really monochromatic and (b) not having content that refers very strongly to the central essence of Blade Runner, I decided to start working on a different clip. This new clip introduces the main plot-line and the Replicants, so its content is a stronger proxy for the whole film, and it also contains a lot of colour variation. Unfortunately, the reconstruction of this clip ended up (after a couple of days of processing) just as monochromatic as the first one. The following images show selected original frames on the top and their corresponding reconstructions below.

store-0000061-orig store-0000061

store-0000451-orig store-0000451

store-0002185-orig store-0002185

store-0008451-orig store-0008451

store-0008602-orig store-0008602

After looking at these results, it seemed clear that the prominence of blue in the clusters could be related to the fact that the clip starts off highly blue. I looked back at my code and found that I was actually sorting the samples by frame number. I’ve since reimplemented the sections that relied on the sample order to determine the input filenames, so they no longer depend on it. I presume that the strong predominance of blue in the first 200 frames of the film affects the distribution of clusters. I’m now redoing the clustering with two changes: random initial centres (I was previously using the Arthur and Vassilvitskii (2007) method, k-means++), and randomly ordered samples, so that they are not processed in order of time.
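To make those two changes concrete, here is a minimal sketch in plain Python. It is not my actual code: the function names, the toy feature vectors, and the seeds are all made up for illustration. The idea is that each sample keeps its frame number alongside it, so the samples can be shuffled freely and the input filename can still be derived from the frame number rather than from the sample's position.

```python
import random

def shuffle_samples(frame_numbers, features, seed=None):
    """Return frame numbers and feature vectors in the same random order.

    Keeping the two lists paired means nothing downstream has to rely
    on the position of a sample to find the frame it came from.
    """
    rng = random.Random(seed)
    order = list(range(len(features)))
    rng.shuffle(order)
    return [frame_numbers[i] for i in order], [features[i] for i in order]

def random_initial_centres(features, k, seed=None):
    """Pick k distinct samples at random to use as initial centres."""
    rng = random.Random(seed)
    return [list(f) for f in rng.sample(features, k)]

# Hypothetical data: ten frames whose colour drifts from blue to red.
frame_numbers = list(range(10))
features = [(n / 10.0, 0.5, 1.0 - n / 10.0) for n in frame_numbers]

shuffled_frames, shuffled_features = shuffle_samples(
    frame_numbers, features, seed=1)
centres = random_initial_centres(shuffled_features, k=3, seed=2)
```

The shuffled lists would then be handed to the clustering step in place of the time-ordered ones.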

I hope this works out; if not, I either need to use a lot more computational power to get k-means to work well, or switch to a clustering method that does not tend to create equally sized clusters, such as Expectation Maximization (for which there is an OpenCV implementation).
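As a rough illustration of why Expectation Maximization might cope better with unevenly sized clusters, here is a toy one-dimensional Gaussian mixture fitted with EM in plain Python. This is only a sketch under made-up assumptions (synthetic data, two components, hand-picked starting means) and it is not the OpenCV implementation. The point is that EM estimates a mixing weight per component, so one cluster is free to account for most of the samples.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_1d(data, means, iterations=50):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                 for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against collapse
    return weights, means, variances

# Synthetic, deliberately unbalanced data: 900 samples near 0, 100 near 8.
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(900)]
        + [rng.gauss(8.0, 1.0) for _ in range(100)])

weights, means, variances = em_1d(data, means=[1.0, 6.0])
```

Here `weights` holds the estimated share of samples belonging to each component, which is where the uneven cluster sizes show up.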

This also affects the sound, so I should carry some of these changes over to the audio clustering code.