#9 Exploration

Posted: October 11, 2019 at 9:50 am

This is the last painting on the long-list to be explored! I’ll re-post a gallery of the final images, which will form the set I’ll select the final works from. The top image is my selection, and I’ve included two explorations below it.

#04 Exploration and Refinement

Posted: October 9, 2019 at 11:29 am

The composition of this painting has a large black hole in the middle. The abstraction process seems to emphasize this and I’m not totally convinced by the results. The best image (top one) does seem a little too abstract, but the emphasis on that dark area is reduced. I think I’ll try something in between sigma 500 and 600 if this image makes the final cut. Explorations below.

#12 Exploration

Posted: October 7, 2019 at 11:14 am

I’ve ruled this painting out due to the lack of contrast.

#13 Exploration

Posted: October 5, 2019 at 5:16 pm

I can’t say I find anything really interesting about this one, so I’m ruling it out. Following are my explorations.

#2 Exploration and Refinement

Posted: October 4, 2019 at 1:05 pm

I’m quite happy with these results; the top image significantly diverges from the figure shape which is still dominant in the two explorations below.

#19 Exploration

Posted: October 3, 2019 at 11:08 am

I think these are a little too colourful, but I think the version on the left is sufficient for comparison in the set. I’m getting close to finishing the medium resolution images and I may have to scale down a few high resolution images (paintings 13, 12, 4 and 9), which have been crashing the machine due to memory use.

#6 Exploration

Posted: October 1, 2019 at 4:56 pm

I’m quite fond of how this one turned out, but on closer inspection I realized the image I’m working from is a scan of a half-tone reproduction (see detail below). If this image makes the selection, I’ll have to find a photographic source. The best image is the largest above with two explorations beneath it.


#25 Exploration

Posted: October 1, 2019 at 4:44 pm

Final Experiment Using Colour Histogram Features

Posted: September 27, 2019 at 11:29 am

My talos search using a 24 bin colour histogram finished. The best model achieved accuracies of 76.6% (training), 74.6% (validation) and 74.2% (test). Compare this to accuracies of 93.3% (training), 71.2% (validation) and 72.0% (test) for the previous best model using initial features. On the test set, this is an improvement of only ~2%. The confusion matrix is quite a lot more skewed with 224 false positives and only 78 false negatives. Compare this to 191 false positives and 136 false negatives for the previous best model using initial features. As the histogram features would need to be calculated after rendering, I think it’s best to stick with the initial features where the output of a generator can be classified before rendering, which will be much more efficient.
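As a rough illustration of the kind of feature discussed here, the following is a minimal sketch of a 24 bin colour histogram. It assumes the 24 bins are 8 bins for each of the R, G and B channels (by analogy with the 48 bin, 16 bins per channel, histogram used in the earlier experiment) and that pixels are 8-bit RGB tuples. The function name `colour_histogram` and the normalization are assumptions for illustration, not the code used in these experiments.

```python
def colour_histogram(pixels, bins_per_channel=8):
    """Concatenated per-channel histogram of 8-bit RGB pixels.

    Returns 3 * bins_per_channel values; each channel's bins sum to 1.
    (Hypothetical helper; the binning scheme is an assumption.)
    """
    hist = [0.0] * (3 * bins_per_channel)
    bin_width = 256 / bins_per_channel
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp so that a value of 255 falls into the last bin.
            index = min(int(value / bin_width), bins_per_channel - 1)
            hist[channel * bins_per_channel + index] += 1
    n = len(pixels)
    return [count / n for count in hist] if n else hist


# Three placeholder pixels standing in for a rendered composition.
example_pixels = [(0, 0, 0), (255, 255, 255), (128, 64, 32)]
features = colour_histogram(example_pixels)
print(len(features))
```

Because the histogram is computed from pixels, it can only be produced once a composition has been rendered, which is the efficiency cost mentioned above.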

The following images show the new 100 compositions classified by the best model using these histogram features.

“Good” Compositions
“Bad” Compositions

#14 Exploration and Refinement

Posted: September 26, 2019 at 9:41 am

I’m quite happy with the results of #14! Selected image on top with a gallery of explorations below.

#17 Is Quite Weak

Posted: September 24, 2019 at 6:24 pm

I can’t say I’m very happy with the results for painting #17. I suppose it’s just far too monochromatic. See explorations below.

Classification using final model

Posted: September 24, 2019 at 5:56 pm

The following two images show the classification done by the final model trained on all the data using the architecture params from hyperparameter search. I think these are slightly better than those from the previous post.

“Good” Compositions
“Bad” Compositions

Looking back through my experiments I thought I would take a crack at one more histogram feature experiment. I saw a peak validation accuracy (using the ruled out problematic method) of 75% with a 24 bin colour histogram, so I thought it would be worth a revisit.

Splits and new classified compositions!

Posted: September 20, 2019 at 7:14 pm

One thing I realized in my previous experiments was that I did not change the train/validate/test split. So I ran a few experiments with different splits; 50/25/25 was my initial choice. I tried 80/10/10, 75/15/15 and 60/20/20. My results showed that 75/15/15 seemed to work the best and I wrote some code to classify new images using that trained model. The following are the results! I think the classification is actually working quite well; a couple of compositions I consider “bad” made it in there, but looking at these two sets I’m quite happy with the results.

“Good” Compositions
“Bad” Compositions

My next ML steps are:

  • finalize my architecture and train the final model
  • integrate the painting generator and face detection to run as a prototype that logs looking durations for each composition
  • run some experiments using this new dataset collected in the ‘wild’ and decide on thresholds for mapping from duration of looking to “good” and “bad” labels.
  • finally determine the best approach to running training code on the Jetson (embed keras? use ANNetGPGPU? FANN?) and implement it.

#22 Exploration and Refinement

Posted: September 20, 2019 at 10:32 am

#22 turned out quite well; I’ve included my favourite choice on top and two explorations below. Perhaps it could be a little smoother, but I think it’s strong enough to serve for the final selection.

Histogram Features Don’t Improve Classification Accuracy

Posted: September 17, 2019 at 4:16 pm

Rerunning the grid search using the 48 bin (16 bins per channel) colour histogram features provided no classification improvement. The search reported a peak validation accuracy of 74% and 83% for the training set. The best model achieved a classification accuracy of 84.6% for training, 70.6% for validation and 72.3% for testing. The confusion matrix for the test set is as follows:

  • 649 bad predicted to be bad.
  • 319 bad predicted to be good
  • 220 good predicted to be bad.
  • 761 good predicted to be good.
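For reference, the test accuracy can be recovered from these four counts. The sketch below is plain arithmetic on the numbers listed above; the function names are hypothetical and not from the experiment code.

```python
def accuracy_from_confusion(bad_as_bad, bad_as_good, good_as_bad, good_as_good):
    """Fraction of all samples that were classified correctly."""
    correct = bad_as_bad + good_as_good
    total = bad_as_bad + bad_as_good + good_as_bad + good_as_good
    return correct / total


def per_class_recall(bad_as_bad, bad_as_good, good_as_bad, good_as_good):
    """Fraction of each true class that was classified correctly."""
    return {
        "bad": bad_as_bad / (bad_as_bad + bad_as_good),
        "good": good_as_good / (good_as_good + good_as_bad),
    }


acc = accuracy_from_confusion(649, 319, 220, 761)
recall = per_class_recall(649, 319, 220, 761)
print(f"test accuracy: {acc:.1%}")  # prints "test accuracy: 72.3%"
```

The per-class recall makes the skew visible: more “bad” compositions are mistaken for “good” than the other way around.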

So it appears I’ve hit the wall and I’m out of ideas. I’ll stick with the initial (instructional) features and see if I can manage a 75% accuracy for an initial model. Looking back at my experiments, it looks like my validation accuracies have ranged from ~62% to ~75% and test from ~70% to ~74%.

At least all this experimentation has meant that I have a pretty good idea that such a model will work on the Jetson and I will not even need a deep network. I may even be able to implement the network using one of the C++ libraries I’ve already been using like FANN or ANNetGPGPU.

No Significant Improvement Using Dropout Layers nor Changing the Number of Hidden Units.

Posted: September 15, 2019 at 6:13 pm

After the realization that the ~80%+ results were in error, I’ve run a few more experiments using the initial features. Unfortunately no improvement from the ~70% results. I added dropout to input and hidden layers (there was previously only dropout on the input layer) and changed the number of units in the hidden layer (rather than using the same number of inputs). I did not try adding a second layer because I have not seen a second hidden layer improve performance in any experiment; perhaps this is due to a lack of sufficient training samples for deep networks.

The parameter search found a validation accuracy of 73.4%, while the best model showed a validation accuracy of 73.9% and a test accuracy of 71.8%. The network was not over-fit with a training accuracy of 88.1%. The confusion matrix for the test set is as follows:

  • 658 bad predicted to be bad.
  • 291 bad predicted to be good
  • 258 good predicted to be bad.
  • 742 good predicted to be good.

I’m now running a slightly broader hyperparameter search using the 48 bin colour histogram and if I still can’t get closer to 80% accuracy I’ll classify my third (small) data set and see how it looks. In thinking about this problem I did realize that there was always a tension in this project. If the network is always learning, its output will become increasingly narrow and never be able to ‘nudge’ the audience’s aesthetic into new territories; there is a need for the system to show the audience ‘risky’ designs to find new aesthetic possibilities. This is akin to getting trapped in local minima; there may be compositions the audience likes even more, but those can only be generated by taking a risk.

#15 Exploration and Refinement

Posted: September 15, 2019 at 5:33 pm

The top image shows my favourite result for #15, which I think is pretty successful; I was not sure how the abstraction of the original (cubist) source would work out. I think this shows sufficient dissolution of the original. Explorations are included in a gallery below.


~86% Test Accuracy Appears to be Spurious

Posted: September 13, 2019 at 5:09 pm

After running a few more experiments, it seems the reported ~86% test accuracy is spurious and related to a lucky random split of data that was probably highly overlapping with the training data split. The highest test and validation accuracies I’ve seen after evaluating models using the same split as training are merely ~74% and 71%, respectively.

I did a little more reading on dropouts and realized I had not tried different numbers of hidden units in the hidden layer, so I’m running a new search with different input and hidden layer dropout rates, number of hidden units and some range of epochs and batch_size. If this does not significantly increase test and validation accuracy then I’ll go back to the colour histogram features and if that does not work… I have no idea…

#24 Exploration and Refinement

Posted: September 13, 2019 at 3:43 pm

I spent a little too much time on #24, but I quite like Yves Tanguy and I thought the muted colour palette here would be interesting. I can’t say I’m happy with the results. I suspect the lack of colour diversity is what causes these to require so many training iterations to obliterate the original. The top image is my favourite, and the gallery below shows the other explorations. I’m next moving onto #15.

#3 Refinement

Posted: September 7, 2019 at 9:39 am

I’ve found it quite difficult to get a version of #3 smooth and without remnants of the original. The image on the top here is closest, even though there is a very small detail in the original which is still visible. Images below were ruled out.

~86% Test Accuracy Using Initial Features?

Posted: September 5, 2019 at 3:58 pm

Following from the previous results using the new workflow, I went back to my initial features (the 52-element vector of instructions used to generate compositions). The results have turned out to be amazing. The best model achieved accuracies of 85.5% (training), 85.6% (validation) and 85.9% (test). This is a significant increase from the previous best result of 79% (validation). These accuracies are means of accuracies reported over five runs with different splits of the data-set. Note, these splits are still 50/25/25 so that the sizes of the subsets are comparable with previous results. The ‘training’ accuracy is then not actually the accuracy on the data used to train the network, but the accuracy on a random subset of similar size to the training set. 616 bad compositions were predicted to be bad, 105 bad predicted to be good, 105 good predicted to be bad and 634 good predicted to be good. Again, these are averages over multiple predictions with different splits.

As I’m writing this I was thinking that my validation method is problematic. I set aside a test set (during training), to check generalizability beyond the training and validation sets. My validation code is a separate instance and has no access to that specific test split. I need to save that specific test set and then validate the best model based on it, not multiple random runs with random splits. This may be skewing my results, since my random splits use both training and validation samples. So what I need to do is save the split used during training and evaluation and run predictions on them. I’m working on those code changes now…
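A minimal sketch of the fix described above: generate the split once, save the indices, and have the separate evaluation instance reload exactly those indices rather than drawing a new random split. The helper names (`make_split`, `save_split`, `load_split`), the JSON file, and the sample count of 1000 are all assumptions for illustration; this is not the code changes referred to in the post.

```python
import json
import os
import random
import tempfile


def make_split(n_samples, fractions=(0.5, 0.25, 0.25), seed=0):
    """Shuffle sample indices once and cut them into train/validate/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n_samples)
    n_val = int(fractions[1] * n_samples)
    return {
        "train": indices[:n_train],
        "validate": indices[n_train:n_train + n_val],
        "test": indices[n_train + n_val:],
    }


def save_split(split, path):
    """Write the split indices to disk so evaluation can reuse them."""
    with open(path, "w") as f:
        json.dump(split, f)


def load_split(path):
    """Read back the exact split that was used during training."""
    with open(path) as f:
        return json.load(f)


# Training side: create and save the split (1000 is a placeholder count).
split = make_split(1000, seed=42)
path = os.path.join(tempfile.gettempdir(), "split_indices.json")
save_split(split, path)

# Evaluation side: reload the same indices instead of re-splitting.
reloaded = load_split(path)
print({name: len(idx) for name, idx in reloaded.items()})
```

Evaluating only on the reloaded test indices keeps samples the network was trained on out of the reported test accuracy.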

#1 Refinement

Posted: September 5, 2019 at 10:20 am

I ran a few more iterations appropriating #1 and they are looking quite nice. I think the top image is the most successful, but I’m not convinced by the blueish band near the right edge. I’m happy with the degree of abstraction where the structure breaks away from the figure form which is still visible in the lower image. I’m starting to realize my choice of neighbourhood size seems to be related to the size of faces in the source. Portraits of one person require larger neighbourhoods than group portraits. An interesting side exploration would be to use face detection to automatically determine neighbourhood size for paintings with faces (assuming face detection works well enough on painted faces). I think I’ll leave this one here for now and move along.

Revisiting Older Experiments

Posted: September 3, 2019 at 5:46 pm

After those recent strong results with the changed code, I’m revisiting older experiments to see if they were in fact showing promise; I’m figuring out whether it was the previous features or the previous validation method that led to that 70% accuracy ceiling.

The 24 colour histogram feature results do not improve upon the 24 hist + 31 non-colour parameter results. I did learn a few things in the process, including that the stochastic splits change the measured accuracy of the best selected model. From this point I’ll be reporting the mean of accuracy and confusion matrices of 5 runs using different random splits of validation and test data. I also re-ran the evaluation code on the previous experiment with 24+31 features in case the good results were a fluke. Following are the results.

31 + 24 Features

Mean Accuracy:


Mean of Confusion Matrices

375.0 bad predicted to be bad
106.4 bad predicted to be good
112.8 good predicted to be bad
381.8 good predicted to be good

24 Hist Features

Mean Accuracy:


Mean of Confusion Matrices

531.8 bad predicted to be bad.
194.6 bad predicted to be good
155.2 good predicted to be bad.
579.4 good predicted to be good.

So the results are that the 31 + 24 features have performed much better than 24 colour hist features alone. I’m rerunning the initial and variance feature experiments using the new validation method.

#1 and #3 Initial Sketches.

Posted: September 3, 2019 at 10:44 am

As I work my way up in resolution, I’ve generated an initial sketch of #1 and #3. #1 requires a larger neighbourhood to create more abstraction since the original is so well known. #3 also needs more iterations as some of the original painting (God’s face) is still visible. I also tried to do a run of one of the larger paintings, #4, but the process crashed, presumably due to a memory error.

#07 Refinements

Posted: August 30, 2019 at 11:14 am

I’m now setting this aside and moving onto the next images in the short list. The top image is the best result at this time.

#7 Explorations

Posted: August 29, 2019 at 11:17 am

While I’m not quite satisfied with these results, the top image shows what I think of as the most successful iteration; there is still a little of the initial conditions showing in the faces though, so I’m running another session with slightly more iterations. The gallery below shows all my explorations of #7 up to this point. I’m struggling a little with the tension between smoothness and somewhat uniform colour patches with their harder edges. For this source painting, the patches in the ground can cue camouflage patterns that I’m not keen on.

#23 and #8 Revisited

Posted: August 26, 2019 at 5:33 pm

As I mentioned in the previous post, I wanted to revisit the previously ruled out paintings. I used smaller learning rates to see if that salvaged them. I can’t say I’m happy with the results; although they are more smooth, they are still lacking.

Further Narrowing Down for #5.

Posted: August 25, 2019 at 3:33 pm

After doing a few more runs with tweaked parameters I’m not sure I’m doing much better so I’m going to leave #05 here and re-run the two lower resolution paintings that were previously ruled out (#23 and #18). The first image is the most successful, but is very similar to those in the top row of the gallery. The bottom row includes the least successful, though I still think there is something to the larger neighbourhood in the lower right image.

Narrowing Down Explorations of #5 With Smaller Learning Rates

Posted: August 24, 2019 at 12:15 pm

After the insight in the previous post, I’ve explored a few variations using learning-rates smaller than 1.0. The following images are my favourites. They balance abstraction and emergent structure quite well, but are not quite there. The image on the left is insufficiently abstract where remnants of the mast in the original are still present. The wave-like structures in the lower left are very interesting and suggest quite a bit of depth and also cue the waves in the original. The image on the right shows quite good abstraction, but lacks some of that complexity in the waves, due to the larger neighbourhood (sigma = 200px).

The following images show the rest of the explorations, including highly over-abstracted versions that approach gradients. I’ve also included an attempt with a relatively high learning rate of 0.5, the highest of these explorations; the rest are 0.25 or 0.1. In that image (upper left of bottom gallery) the wave section in the lower left is very interesting, although it approaches the appearance of spires; I’m not sure about the harder edges and mottled patches. That composition also shows a degree of under-organization at the smaller scale, e.g. splashes of red in the area above the bright spot.

Spires and Full Resolution Explorations.

Posted: August 21, 2019 at 12:56 pm

The images above show a few attempts to reproduce the aesthetic of the mid-resolution exploration of #5 at full resolution. As the ‘spires’ clearly overwhelm the image I wrote to the author of ANNetGPGPU. The conclusion is that the interaction of high learning rates and small neighbourhood functions leads to cases where the next BMU is very likely to be close to the previous BMU. The result is a trail of BMUs that progress across the SOM. It is unclear why they always progress at the same angle. I’m now running a test with a learning rate of 0.75 (rather than 1.0 as used previously) and I’ll continue to change learning rates and see how that looks! I may want to also revisit my previously ruled out paintings with this new insight. Now that I know these spires are an emergent result of the SOM, it’s something I should explicitly explore in the future!
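To make the roles of the learning rate and neighbourhood size concrete, here is a minimal sketch of one textbook SOM training step with a Gaussian neighbourhood. It is an assumption that ANNetGPGPU behaves comparably; its internals are not reproduced here, and the function names and the tiny 3x3 map are placeholders. With a learning rate of 1.0 the BMU is overwritten with the input sample, while a small sigma leaves units further away almost untouched.

```python
import math


def best_matching_unit(weights, sample):
    """Return (row, col) of the unit whose weights are closest to the sample."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - si) ** 2 for wi, si in zip(w, sample))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def som_update(weights, sample, learning_rate, sigma):
    """One in-place training step: pull units toward the sample.

    Each unit moves by learning_rate * influence, where influence is a
    Gaussian of its grid distance to the BMU with width sigma.
    """
    bmu = best_matching_unit(weights, sample)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_dist_sq = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            influence = math.exp(-grid_dist_sq / (2.0 * sigma ** 2))
            row[c] = [wi + learning_rate * influence * (si - wi)
                      for wi, si in zip(w, sample)]
    return bmu


# A 3x3 map of RGB weights in [0, 1] and one input colour (placeholder values).
som = [[[r / 2, c / 2, 0.5] for c in range(3)] for r in range(3)]
sample = [0.9, 0.1, 0.2]
far_before = list(som[0][2])

bmu = som_update(som, sample, learning_rate=1.0, sigma=0.5)
print(bmu)  # prints (2, 0)
```

Lowering `learning_rate` below 1.0, as in the 0.75 test mentioned above, means the BMU only moves part of the way toward each sample.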