DNN Face Detection Confidence — Part 4

Posted: December 2, 2019 at 7:23 pm

I ran the OpenCV DNN-based face detector while I was working, and the results are much better than I previously saw with the jetson-inference example. I presume the difference in performance is due to the use of a different model. The following plot shows my face run (red) on top of the noFace run from the previous post (blue). The mean face confidence was 0.935 (compared to the mean noFace confidence of 0.11), and there is a clear gap between the confidence where a face is present and where no face is present. It seems this is the method I should use; I’ll try integrating it into my existing code and see how much of a problem the detection of face profiles turns out to be.
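For integration, a minimal sketch of reading per-frame confidence from the OpenCV DNN face detector might look like the following. The model file names, the 300×300 input size, and the 0.5 threshold are assumptions for illustration (the threshold just needs to sit inside the gap observed above), not values from my code.

```python
import cv2

# The SSD face detector distributed with the OpenCV DNN samples (file names assumed).
net = cv2.dnn.readNetFromCaffe("deploy.prototxt",
                               "res10_300x300_ssd_iter_140000.caffemodel")

def max_face_confidence(frame):
    """Return the highest detection confidence in a BGR frame."""
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
                                 (300, 300), (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()              # shape: (1, 1, N, 7)
    confidences = detections[0, 0, :, 2]
    return float(confidences.max()) if confidences.size else 0.0

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    conf = max_face_confidence(frame)
    face_present = conf > 0.5               # any threshold inside the observed gap
    print(conf, face_present)
```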


DNN Face Detection Confidence — Part 3

Posted: November 27, 2019 at 5:29 pm

Following from the previous post, I tried to load the Caffe weights used in this example in the Jetson-optimized inference example; the model could not be loaded, so I guess the architecture/format is not compatible (even though both are Caffe models for object detection). On the plus side, I managed to compile and run the DNN face detection code from the OpenCV examples! The problem was that the arguments were not being passed properly. (Amazing how many code examples I’m finding that don’t actually work without modification.)

The good news is that the model and OpenCV code work very well, actually very, very well. In my two-hour test with no faces and a confidence threshold set to 0.1, the max confidence for non-faces was only 0.19! Compare this to the model/jetson-inference code, where the same conditions led to non-faces being recognized with confidence as high as 0.96! The following plot shows the results of the test:

I had to clip the first 1000 or so data samples because my partially visible face was present, and that caused spikes in confidence as high as 0.83! The implication is that this detector is much more sensitive to partial/profile faces, which may mean that viewers would have to really look away from the Zombie Formalist for it to generate new images. Technically, I don’t want it to detect profiles as faces. The next stage is to do a test with a face present and determine what the range of confidence is and how much of a problem face-profile detection causes…
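As a rough sketch, summarising one of these logged runs could look like the following; the log file name, its one-value-per-frame format, and the hard-coded 1000-sample clip are assumptions based on this post.

```python
import numpy as np

# Summarise a logged confidence run; drop the initial frames where my face
# was partially visible (the spikes described above).
conf = np.loadtxt("noface_confidence.log")[1000:]

print("samples:", conf.size)
print("mean: %.3f  max: %.3f  min: %.3f" % (conf.mean(), conf.max(), conf.min()))
print("fraction above 0.5: %.4f" % (conf > 0.5).mean())
```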


DNN Face Detection Confidence — Part 2

Posted: November 21, 2019 at 5:39 pm

I ran a whole-day (~8 hour) test while no one was home, with a low confidence threshold (0.1) for deep face detection. As I had previously seen, non-faces can be assigned very high confidence values. Before sunset (which strangely leads to high confidence in noise), the confidence wavers around quite a lot and the max confidence remains 0.96.

The following image shows the extreme wavering of confidence over time where no faces are present (blue), shown with the short face test (red). The horizontal lines show the means of the face and no-face sets. It seems that under certain (lighting) conditions, like the dip below, the DNN reports very low confidence values (0.36) that would be easily differentiated from true-positive faces. Since I’m working with example code, I have not been dumping the camera frames corresponding to these values; I may need them to determine under what conditions the DNN does perform well. Tomorrow I’ll run a test while I’m working (with a face present), see if I can make sure there are no false positives, and collect more samples. Over this larger dataset I have determined that the bump of no-face samples around 0.8 confidence does not happen in appropriate (bright) lighting conditions; see the histogram below.
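A rough sketch of the kind of comparison plot described above is shown below; the log file names are assumptions, and each file is assumed to hold one confidence value per frame.

```python
import numpy as np
import matplotlib.pyplot as plt

noface = np.loadtxt("noface_confidence.log")
face = np.loadtxt("face_confidence.log")

# Confidence over time for both runs, with horizontal lines at the means.
plt.plot(noface, color="tab:blue", label="no face")
plt.plot(face, color="tab:red", label="face")
plt.axhline(noface.mean(), color="tab:blue", linestyle="--")
plt.axhline(face.mean(), color="tab:red", linestyle="--")
plt.xlabel("frame")
plt.ylabel("detector confidence")
plt.legend()
plt.show()
```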

Without more information it’s unclear what confidence threshold would be appropriate, or even whether the DNN face detector is indeed performing better than the Haar-based detector. This reference showed a significant difference in performance between the DNN and Haar methods, so I’ll see what model they used and hope for better performance using that…


DNN Face Detection Confidence

Posted: November 18, 2019 at 6:17 pm

As I mentioned in the previous post, I was curious whether the DNN method would be any harder to “fool” than the old Haar method. The bad news is that a DNN will report quite high confidence when there are no faces, even in a dark room where most of the signal is actually sensor noise. The following plot shows the confidence over time in the face (red) and no-face (blue) cases. The no-face case involved the sun setting and the room getting dark, which can be seen in the increase in the variance of the confidence over time (compared to the relatively stable confidence of the face case). The confidence threshold was 0.6 for the face case and 0.1 for the no-face case.


Deep Face Detection

Posted: November 17, 2019 at 6:46 pm

Following from my realization that the Haar-based classifier is extremely noisy for face detection, I decided to look into deep-network-based face detection methods. I found example code optimized for the Jetson to do inference using deep models. Some bugs in the code have made it hard to test, but I’ve fixed enough of them to start at least an early evaluation.

At first blush, the DNN method (using the facenet-120 model) is quite robust, but one of the bugs resets the USB camera’s brightness and focus, which makes evaluation difficult. There appear to be very, very few false positives; unfortunately, there are also quite a lot of false negatives. It does appear that a complex background is a problem for the DNN face detector, as it was for the Haar classifier.

I’m now dumping a bunch of confidence values in a context in which I know there is only one face being detected, to get a sense of the variance… Then I’ll do a run where I know there will be no faces in the images and see what the variance of confidence is for that case. There is also some DNN-based face detection code in OpenCV that looks to be compatible, which I’m also trying to figure out.
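A minimal sketch of dumping per-detection confidence with the jetson-inference Python bindings might look like the following. This is based on the detectnet camera example as I understand it; the camera device, resolution, low 0.1 threshold, and log file name are assumptions, and exact signatures may differ between library versions.

```python
import sys
import jetson.inference
import jetson.utils

# Load the face detection model with a deliberately low threshold so that
# low-confidence detections get logged too.
net = jetson.inference.detectNet("facenet-120", sys.argv, 0.1)
camera = jetson.utils.gstCamera(1280, 720, "/dev/video0")

log = open("face_confidence.log", "w")
while True:
    img, width, height = camera.CaptureRGBA()
    detections = net.Detect(img, width, height)
    for d in detections:
        log.write("%f\n" % d.Confidence)
    log.flush()
```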


Face Detection Inaccuracy

Posted: November 8, 2019 at 10:09 am

After getting the new rendering code and face detection into an integrated prototype that I can test (and use to generate training data), I’m realizing the old-school Haar classifier running on the GPU works very, very poorly. Running the system with suitable lighting (I stopped labelling data once the images got too dark) yielded 628 detected faces; of those, 325 were false positives. This is not great, and the complex background did not help; see the image below. I did not keep track of the number of frames processed (true negatives), so these numbers appear much worse than they actually are in terms of accuracy; there were likely thousands of true negatives. In a gallery context there would be much more control over the background, but I should try some example code using a trained CNN to detect faces and see how it performs.

False positive in complex background
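For reference, the plain (CPU) OpenCV equivalent of this kind of Haar-cascade face detection is sketched below; my prototype uses the GPU implementation, so this is only an illustration, and the cascade file and detectMultiScale parameters are the stock defaults rather than my settings.

```python
import cv2

# Stock frontal-face Haar cascade shipped with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    print("faces detected:", len(faces))
```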

More Images of Compositions with X and Y Layer Offsets

Posted: November 3, 2019 at 11:14 am

The following image shows a selection of “good” results using the new renderer with 2D offsets.


New Compositions with X and Y Layer Offsets

Posted: October 30, 2019 at 2:08 pm

The following image shows 25 randomly generated compositions where the layers can be offset in both directions. This allows for a lot more variation, and also for circles to include radial stripes that do not terminate in the middle. I’m about to meet with my tech, Bobbi Kozinuk, to talk about my new idea for the case design and any technical implications. I’ll also create a prototype that collects the time I look at each composition as a new dataset for training.


Long-List of Appropriated Paintings

Posted: October 30, 2019 at 11:37 am

The gallery below shows the strongest results from all my explorations and refinements of the paintings. I’ll use this set to narrow down to a shortlist that will be finalized and produced. I’m not yet sure about the print media or size, but I was thinking of normalizing them to ~19″ high to match the height of the Zombie Formalist. This would mean the tallest in this long-list would be ~8.5″ × 19″ (W × H) and the widest ~43″ × 19″. For media, I was thinking inkjet on canvas would emphasize painting.


AA Solution

Posted: October 25, 2019 at 4:02 pm

I ended up adding the padding only to the right edge, which cleans up the hard outer edges of circles, where the jaggies bothered me the most. I also realized that there were dark pixels around the feathered edges. This was due to a blending error where I was clearing a framebuffer to transparent black rather than to a transparent version of the background colour. There are still some jaggies, as shown in the images below, but the edges are working quite well.
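The dark-fringe effect can be illustrated with a tiny compositing sketch (this is not my renderer code, just a numerical illustration of a standard over blend): a feathered pixel blended into a buffer cleared to transparent black is dragged toward black, while one blended into the background colour keeps its brightness. The colour values below are arbitrary.

```python
import numpy as np

background = np.array([0.9, 0.7, 0.2])   # composition background colour (arbitrary)
stripe     = np.array([0.1, 0.4, 0.8])   # layer colour (arbitrary)
alpha      = 0.5                         # a feathered edge pixel, half transparent

def over(src, src_alpha, dst):
    """Standard (non-premultiplied) over blend of RGB values."""
    return src * src_alpha + dst * (1 - src_alpha)

# Framebuffer cleared to transparent *black*: the fringe is darkened.
print(over(stripe, alpha, np.zeros(3)))    # [0.05 0.2  0.4 ]
# Framebuffer cleared to the background colour: the fringe stays consistent.
print(over(stripe, alpha, background))     # [0.5  0.55 0.5 ]
```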

I also made some quick changes after realizing that radial lines are never offset inwards or outwards from the circle; this is because offsets were only applied in 1D. I’ve added a second offset parameter for 2D offsets, and there is a lot of additional variety. I just realized this also means my previously trained model is no longer useful (due to the additional parameter), but I’ll need to train on some actual attention data anyhow. I’ll post some of those new compositions soon.


AA Edges (Again)…

Posted: October 25, 2019 at 11:24 am

After more testing I realized the padding approach previously posted includes some unintended consequences. Since all edges had padding, the circles are no longer continuous, and the padding introduces a seam where 0° = 360°, as shown in the following image. I also noticed that in some cases the background colour can be totally obscured by the stripes, which makes the padding look like a thin frame in a very different colour than the rest of the composition. In the end, while these changes make the edges look less digital, they introduce more problems than they solve.


AA Edges in Zombie Formalist Renderer

Posted: October 24, 2019 at 12:14 pm

In my test data for machine learning I was not very happy with the results because of strong jaggies, especially on the outer edges where the edge of the texture cuts off the sine-wave gradient. I added some padding to the left and right edges of each single-row layer and used a 1D shader blur to soften those cut-off edges. This works quite well but, as shown below, only affects the left and right edges; the top and bottom stay jaggy. (Note: due to the orientation of layers, sometimes these ‘outer’ jaggies are radial and sometimes circular.)
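For illustration, the same softening could be done on the CPU with a horizontal-only Gaussian blur (my version is a shader in the renderer, so this is just an analogue); the kernel width and file names are assumptions.

```python
import cv2

# Blur a rendered layer only along the x axis, softening the padded
# left/right edges while leaving the y axis untouched.
layer = cv2.imread("layer.png", cv2.IMREAD_UNCHANGED)
softened = cv2.GaussianBlur(layer, (9, 1), 0)   # kernel: 9 px wide, 1 px tall
cv2.imwrite("layer_softened.png", softened)
```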


#9 Exploration

Posted: October 11, 2019 at 9:50 am

This is the last painting on the long-list to be explored! I’ll re-post a gallery of the final images, which will be the set I select from for the final works. The top image is my selection, and I’ve included two explorations below it.


#4 Exploration and Refinement

Posted: October 9, 2019 at 11:29 am

The composition of this painting has a large black hole in the middle. The abstraction process seems to emphasize this, and I’m not totally sure about the results. The best image (the top one) does seem a little too abstract, but the emphasis on that dark area is reduced. I think I’ll try something in between sigma 500 and 600 if this image makes the final cut. Explorations below.


#12 Exploration

Posted: October 7, 2019 at 11:14 am

I’ve ruled this painting out due to the lack of contrast.


#13 Exploration

Posted: October 5, 2019 at 5:16 pm

I can’t say I find anything really interesting about this one, so I’m ruling it out. Following are my explorations.


#2 Exploration and Refinement

Posted: October 4, 2019 at 1:05 pm

I’m quite happy with these results; the top image significantly diverges from the figure shape which is still dominant in the two explorations below.


#19 Exploration

Posted: October 3, 2019 at 11:08 am

These are a little too colourful, but I think the version on the left is sufficient for comparison in the set. I’m getting close to finishing the medium-resolution images, and I may have to scale down a few high-resolution images (paintings 13, 12, 4 and 9), which have been crashing the machine due to memory use.


#6 Exploration

Posted: October 1, 2019 at 4:56 pm

I’m quite fond of how this one turned out, but on closer inspection I realized the image I’m working from is a scan of a half-tone reproduction (see detail below). If this image makes the selection, I’ll have to find a photographic source. The best image is the largest above with two explorations beneath it.


#25 Exploration

Posted: October 1, 2019 at 4:44 pm


Final Experiment Using Colour Histogram Features

Posted: September 27, 2019 at 11:29 am

My talos search using a 24-bin colour histogram finished. The best model achieved accuracies of 76.6% (training), 74.6% (validation) and 74.2% (test). Compare this to accuracies of 93.3% (training), 71.2% (validation) and 72.0% (test) for the previous best model using the initial features. On the test set, this is an improvement of only ~2%. The confusion matrix is quite a lot more skewed, with 224 false positives and only 78 false negatives, compared to 191 false positives and 136 false negatives for the previous best model using the initial features. As the histogram features would need to be calculated after rendering, I think it’s best to stick with the initial features, where the output of the generator can be classified before rendering, which will be much more efficient.
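For reference, a 24-bin colour histogram feature of the sort used here could be computed as in the following sketch: 8 bins per channel, concatenated and normalised. The per-channel bin count, normalisation, and file name are assumptions about the details.

```python
import cv2
import numpy as np

def colour_histogram_features(image_bgr, bins_per_channel=8):
    """Concatenate per-channel histograms into a 3 * bins_per_channel vector."""
    feats = []
    for channel in range(3):
        hist = cv2.calcHist([image_bgr], [channel], None,
                            [bins_per_channel], [0, 256])
        feats.append(hist.flatten() / hist.sum())   # normalise each channel
    return np.concatenate(feats)                    # 24 values at 8 bins/channel

features = colour_histogram_features(cv2.imread("composition.png"))
```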

The following images show the new 100 compositions classified by the best model using these histogram features.

“Good” Compositions
“Bad” Compositions

#14 Exploration and Refinement

Posted: September 26, 2019 at 9:41 am

I’m quite happy with the results of #14! Selected image on top with a gallery of explorations below.


#17 Is Quite Weak

Posted: September 24, 2019 at 6:24 pm

I can’t say I’m very happy with the results for painting #17. I suppose it’s just far too monochromatic. See explorations below.


Classification using final model

Posted: September 24, 2019 at 5:56 pm

The following two images show the classification done by the final model, trained on all the data using the architecture parameters from the hyperparameter search. I think these are slightly better than those from the previous post.

“Good” Compositions
“Bad” Compositions

Looking back through my experiments, I thought I would take a crack at one more histogram feature experiment. I saw a peak validation accuracy (using the ruled-out problematic method) of 75% with a 24-bin colour histogram, so I thought it would be worth a revisit.


Splits and new classified compositions!

Posted: September 20, 2019 at 7:14 pm

One thing I realized in my previous experiments was that I did not change the train/validate/test split. So I ran a few experiments with different splits; 50/25/25 was my initial choice, and I tried 80/10/10, 75/15/15 and 60/20/20. My results showed that 75/15/15 seemed to work the best, and I wrote some code to classify new images using that trained model. The following are the results! I think the classification is actually working quite well; a couple of compositions I consider “bad” made it in there, but looking at these two sets I’m quite happy with the results.

“Good” Compositions
“Bad” Compositions

My next ML steps are:

  • finalize my architecture and train the final model.
  • integrate the painting generator and face detection to run as a prototype that logs looking durations for each composition.
  • run some experiments using this new dataset collected in the ‘wild’ and decide on thresholds for mapping from duration of looking to “good” and “bad” labels (a rough sketch of such a mapping follows this list).
  • finally determine the best approach to running training code on the Jetson (embed keras? use ANNetGPGPU? FANN?) and implement it.
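As a placeholder for the third step, the duration-to-label mapping might look something like the sketch below; the threshold values and the idea of dropping ambiguous durations are assumptions to be revisited once I have real attention data.

```python
def label_from_duration(seconds, good_threshold=4.0, bad_threshold=1.0):
    """Map how long a composition was looked at to a training label.

    Thresholds are placeholders; durations in between are treated as
    ambiguous and excluded from the training set.
    """
    if seconds >= good_threshold:
        return "good"
    if seconds <= bad_threshold:
        return "bad"
    return None
```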

#22 Exploration and Refinement

Posted: September 20, 2019 at 10:32 am

#22 turned out quite well; I’ve included my favourite choice on top and two explorations below. Perhaps it could be a little smoother, but I think it’s strong enough to serve for the final selection.


Histogram Features Don’t Improve Classification Accuracy

Posted: September 17, 2019 at 4:16 pm

Rerunning the grid search using the 48-bin (16 bins per channel) colour histogram features provided no classification improvement. The search reported a peak validation accuracy of 74% and 83% for the training set. The best model achieved a classification accuracy of 84.6% for training, 70.6% for validation and 72.3% for testing. The confusion matrix for the test set is as follows (a quick check of these counts against the reported test accuracy follows the list):

  • 649 bad predicted to be bad.
  • 319 bad predicted to be good.
  • 220 good predicted to be bad.
  • 761 good predicted to be good.
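As a sanity check, the test accuracy can be recomputed from these counts; the little calculation below just confirms the 72.3% reported above.

```python
# Test-set confusion matrix counts from the list above.
true_bad, false_good = 649, 319    # bad compositions: correct, misclassified
false_bad, true_good = 220, 761    # good compositions: misclassified, correct

total = true_bad + false_good + false_bad + true_good     # 1949
accuracy = (true_bad + true_good) / total                 # 1410 / 1949
print("test accuracy: %.1f%%" % (100 * accuracy))         # -> 72.3%
```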

So it appears I’ve hit the wall and I’m out of ideas. I’ll stick with the initial (instructional) features and see if I can manage 75% accuracy for an initial model. Looking back at my experiments, it looks like my validation accuracies have ranged from ~62% to ~75% and my test accuracies from ~70% to ~74%.

At least all this experimentation has given me a pretty good idea that such a model will work on the Jetson, and I will not even need a deep network. I may even be able to implement the network using one of the C++ libraries I’ve already been using, like FANN or ANNetGPGPU.


No Significant Improvement Using Dropout Layers or Changing the Number of Hidden Units

Posted: September 15, 2019 at 6:13 pm

After the realization that the ~80%+ results were in error, I’ve run a few more experiments using the initial features. Unfortunately, there was no improvement over the ~70% results. I added dropout to the input and hidden layers (there was previously only dropout on the input layer) and changed the number of units in the hidden layer (rather than using the same number as the inputs). I did not try adding a second hidden layer because I have not seen one improve performance in any experiment; perhaps this is due to a lack of sufficient training samples for deep networks.
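A minimal sketch of the kind of single-hidden-layer Keras model described here is shown below; the unit counts, dropout rates, and optimizer are placeholders rather than the values found by the search.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

def build_model(n_features, hidden_units=32,
                input_dropout=0.2, hidden_dropout=0.5):
    """One hidden layer with dropout on both the inputs and the hidden layer."""
    model = Sequential([
        Dropout(input_dropout, input_shape=(n_features,)),  # dropout on the inputs
        Dense(hidden_units, activation="relu"),             # hidden layer
        Dropout(hidden_dropout),                            # dropout on the hidden layer
        Dense(1, activation="sigmoid"),                      # good/bad output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```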

The parameter search found a validation accuracy of 73.4%, while the best model showed a validation accuracy of 73.9% and a test accuracy of 71.8%. The network was not over-fit with a training accuracy of 88.1%. The confusion matrix for the test set is as follows:

  • 658 bad predicted to be bad.
  • 291 bad predicted to be good.
  • 258 good predicted to be bad.
  • 742 good predicted to be good.

I’m now running a slightly broader hyperparameter search using the 48-bin colour histogram, and if I still can’t get closer to 80% accuracy I’ll classify my third (small) dataset and see how it looks. In thinking about this problem I realized that there has always been a tension in this project: if the network is always learning, its output will become increasingly narrow and it will never be able to ‘nudge’ the audience’s aesthetic into new territories; the system needs to show the audience ‘risky’ designs to find new aesthetic possibilities. This is akin to getting trapped in a local minimum; there may be compositions the audience likes even more, but those can only be generated by taking a risk.
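One simple way to build in that risk-taking would be something like an epsilon-greedy selection policy, sketched below; this is my own illustration of the idea rather than anything implemented, and the epsilon value, candidate list, and scoring function are hypothetical.

```python
import random

def pick_composition(candidates, score, epsilon=0.1):
    """Mostly exploit the model's favourite, occasionally take a risk.

    candidates: list of composition parameter vectors.
    score: callable returning the model's "good" probability for a candidate.
    """
    if random.random() < epsilon:
        return random.choice(candidates)   # explore: show a risky design
    return max(candidates, key=score)      # exploit: show the predicted best
```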


#15 Exploration and Refinement

Posted: September 15, 2019 at 5:33 pm

The top image shows my favourite result for #15, which I think is pretty successful; I was not sure how the abstraction of the original (cubist) source would work out. I think this shows sufficient dissolution of the original. Explorations are included in a gallery below.


~86% Test Accuracy Appears to be Spurious

Posted: September 13, 2019 at 5:09 pm

After running a few more experiments, it seems the reported near-90% test accuracy is spurious and related to a lucky random split of the data that probably overlapped heavily with the training split. The highest test and validation accuracies I’ve seen after evaluating models using the same split as training are merely ~74% and ~71%, respectively.
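To keep evaluation honest, the split can be fixed so that training and evaluation always see the same partition; below is a minimal sketch using scikit-learn with a fixed random_state (the fractions, seed, and placeholder data are assumptions).

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(2000, 20)          # placeholder feature matrix
y = np.random.randint(0, 2, 2000)     # placeholder good/bad labels

# Hold out the test set first, then carve validation out of the remainder.
# A fixed random_state keeps the same split across training and evaluation runs.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.15, random_state=42, stratify=y)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.15, random_state=42, stratify=y_trainval)
```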

I did a little more reading on dropout and realized I had not tried different numbers of hidden units in the hidden layer, so I’m running a new search with different input- and hidden-layer dropout rates, numbers of hidden units, and a range of epochs and batch_size values. If this does not significantly increase test and validation accuracy, then I’ll go back to the colour histogram features, and if that does not work… I have no idea…