Partial Sorting with Fewer Large Percepts

Posted: October 11, 2019 at 5:56 pm

The following images are a few variations in which a smaller subset of large percepts is shifted to the upper 3/4 (top) and middle (bottom) of the stack. Of these partial stacking explorations, I think the top image is the strongest yet, but it still seems too uniform compared to the unsorted version. I have a couple more ideas for variations, but so far nothing is an improvement on the unsorted version.


#9 Exploration

Posted: October 11, 2019 at 9:50 am

This is the last painting on the long-list to be explored! I’ll re-post a gallery of the final images; this will be the set from which I’ll select the final works. The top image is my selection, and I’ve included two explorations below it.


Partial Sorting

Posted: October 10, 2019 at 11:06 am

Looking back at the sorted and unsorted versions, I realized I like aspects of both. The sorted version certainly has more flow, but the smaller segments all on top obliterate any photographic reading. The unsorted version is strong because the larger photographic segments are readable, but they also interrupt the flow. The images below show intermediary versions where the segments are sorted, but then a subset of the largest segments is shifted up in the stack (rendering order) to increase photographic readability. I still think the unsorted version is strongest, probably due to the variation of texture over the image; in the explorations below, sorting by area makes the texture quite uniform. I may try a few more variations where a much smaller selection of large percepts is moved higher up; in the images below, 25% and 15% of the largest percepts are inserted into the middle, top quarter and top 15% of the stack, respectively. I think these partially sorted variations are more intentional than the randomly shuffled explorations; I want larger segments to be near the top, but perhaps not at the very top.
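
To make the stacking manipulation concrete, here is a minimal Python sketch of the idea; the function name, arguments and default fractions are placeholders rather than my actual rendering code:

```python
def partial_sort_stack(percepts, areas, top_fraction=0.25, insert_at=0.75):
    """Sort percepts by area (largest rendered first, i.e. at the bottom of the
    stack), then re-insert the largest `top_fraction` of percepts at a relative
    depth `insert_at` (0.0 = bottom, 1.0 = top of the rendering order)."""
    order = sorted(range(len(percepts)), key=lambda i: areas[i], reverse=True)
    n_move = int(len(order) * top_fraction)
    largest, rest = order[:n_move], order[n_move:]
    # Slot the largest percepts part-way up the stack of remaining percepts.
    split = int(len(rest) * insert_at)
    new_order = rest[:split] + largest + rest[split:]
    return [percepts[i] for i in new_order]

# e.g. move the largest 15% of percepts into the top quarter of the stack:
# stacked = partial_sort_stack(percepts, areas, top_fraction=0.15, insert_at=0.75)
```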


#04 Exploration and Refinement

Posted: October 9, 2019 at 11:29 am

The composition of this painting has a large black hole in the middle. The abstraction process seems to emphasize this, and I’m not totally sure about the results. The best image (top one) does seem a little too abstract, but the emphasis on that dark area is reduced. I think I’ll try something in between sigma 500 and 600 if this image makes the final cut. Explorations below.


#12 Exploration

Posted: October 7, 2019 at 11:14 am

I’ve ruled this painting out due to the lack of contrast.


#13 Exploration

Posted: October 5, 2019 at 5:16 pm

I can’t say I find anything really interesting about this one, so I’m ruling it out. Following are my explorations.


#2 Exploration and Refinement

Posted: October 4, 2019 at 1:05 pm

I’m quite happy with these results; the top image significantly diverges from the figure shape, which is still dominant in the two explorations below.


#19 Exploration

Posted: October 3, 2019 at 11:08 am

These are a little too colourful, but I think the version on the left is sufficient for comparison in the set. I’m getting close to finishing the medium resolution images and I may have to scale down a few high resolution images (paintings 13, 12, 4 and 9), which have been crashing the machine due to memory use.


#6 Exploration

Posted: October 1, 2019 at 4:56 pm

I’m quite fond of how this one turned out, but on closer inspection I realized the image I’m working from is a scan of a half-tone reproduction (see detail below). If this image makes the selection, I’ll have to find a photographic source. The best image is the largest above with two explorations beneath it.


#25 Exploration

Posted: October 1, 2019 at 4:44 pm


Final Experiment Using Colour Histogram Features

Posted: September 27, 2019 at 11:29 am

My talos search using a 24 bin colour histogram finished. The best model achieved accuracies of 76.6% (training), 74.6% (validation) and 74.2% (test). Compare this to accuracies of 93.3% (training), 71.2% (validation) and 72.0% (test) for the previous best model using initial features. On the test set, this is an improvement of only ~2%. The confusion matrix is quite a lot more skewed with 224 false positives and only 78 false negatives. Compare this to 191 false positives and 136 false negatives for the previous best model using initial features. As the histogram features would need to be calculated after rendering, I think it’s best to stick with the initial features where the output of a generator can be classified before rendering, which will be much more efficient.
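
For reference, this is roughly what the 24 bin colour histogram feature looks like, assuming 8 bins per channel (the 48 bin version uses 16 bins per channel); a minimal sketch, not my actual feature-extraction code:

```python
import numpy as np

def colour_histogram_features(image, bins_per_channel=8):
    """Concatenated per-channel colour histogram: 8 bins x 3 channels = 24 features.
    `image` is an H x W x 3 array with values in [0, 255]."""
    features = []
    for channel in range(3):
        hist, _ = np.histogram(image[..., channel],
                               bins=bins_per_channel, range=(0, 255))
        features.append(hist)
    features = np.concatenate(features).astype(np.float64)
    return features / features.sum()  # normalize so the feature is size-independent
```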

The following images show the new 100 compositions classified by the best model using these histogram features.

“Good” Compositions
“Bad” Compositions

#14 Exploration and Refinement

Posted: September 26, 2019 at 9:41 am

I’m quite happy with the results of #14! Selected image on top with a gallery of explorations below.


#17 Is Quite Weak

Posted: September 24, 2019 at 6:24 pm

I can’t say I’m very happy with the results for painting #17. I suppose it’s just far too monochromatic. See explorations below.


Stacking Order by Orientation Sort.

Posted: September 24, 2019 at 6:17 pm

I thought I would try sorting the segments according to orientation, rather than area, but the results look about the same as the shuffled version.


Classification using final model

Posted: September 24, 2019 at 5:56 pm

The following two images show the classification done by the final model trained on all the data, using the architecture parameters from the hyperparameter search. I think these are slightly better than those from the previous post.

“Good” Compositions
“Bad” Compositions

Looking back through my experiments I thought I would take a crack at one more histogram feature experiment. I saw a peak validation accuracy (using the ruled-out, problematic method) of 75% with a 24 bin colour histogram, so I thought it would be worth a revisit.


Meeting the Universe Halfway: Chapter 4 – Agential Realism

Posted: September 23, 2019 at 3:37 pm

I finally got to reading Karen Barad’s book (titled above) and thought I would post my notes here while I reflect on them. After reading I also realized that I had gotten Bohm and Bohr confused in my notes from the Karen Barad Seminar; this has now been corrected. In parallel with the collage production one idea is to reconsider my current Artist Statement and rewrite it to be consistent with Agential Realism. Next, I think I’m going to read Chapter 7 to focus on what is meant by “entanglements”. My notes on chapter 4 are as follows:


Refinement of 3,000,000 Training Iteration Version

Posted: September 23, 2019 at 12:30 pm

Since the previous post, I’ve focused on developing the 3,000,000 iteration version. I was not happy with the shuffled version, shown below on the right of the 3,000,000 iteration version. I prefer the balance of large photo-readable segments and small segments that emphasize flow in the left (previously posted) version.

Following this I generated a sorted version of this composition where larger segments are behind the smaller segments; this emphasizes greater flow, but at the expense of the photo-readable segments being visible. I’ve included the sorted version and a few details below. I was just thinking that perhaps I could bring a small subset of the large (or medium) segments in front of the small ones by manipulating their order in a more complex way; for example, randomly selecting a few segments from the large end and inserting them at the small end.


Explorations with 3,000,000 Training Iterations.

Posted: September 22, 2019 at 8:57 pm

After the previous explorations I thought I would focus on the 3,000,000 iteration collages and generated two more options. I still think the previous work is the strongest. I’m going to now generate unsorted, sorted and shuffled versions of that previous composition and decide which is most successful.


Fewer Iterations and Random Shuffling

Posted: September 21, 2019 at 10:13 am

Following from previous collages I thought I would try fewer iterations (100,000) and randomly shuffling the stacking order of percepts. I can’t say I’m happy with these results; the most recent iteration is still the strongest. I’ve included a few of these explorations below. I’m now calculating a couple of variations with 3,000,000 training iterations. I’m also going to focus on Barad and (re)framing my thinking about objects in relation to how I’ve been thinking about Machine Subjectivity. This will manifest in rewriting my artist statement, and I’ve also been playing with the idea of the artist statement as indeterminate, where the specific language is manifested as multiple permutations.


Splits and new classified compositions!

Posted: September 20, 2019 at 7:14 pm

One thing I realized in my previous experiments was that I did not change the train/validate/test split. So I ran a few experiments with different splits; 50/25/25 was my initial choice, and I also tried 80/10/10, 75/15/15 and 60/20/20. My results showed that 75/15/15 seemed to work the best, and I wrote some code to classify new images using that trained model. The following are the results! I think the classification is actually working quite well; a couple of compositions I consider “bad” made it in there, but looking at these two sets I’m quite happy with the results.

“Good” Compositions
“Bad” Compositions
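
The classification step itself is thin; roughly something like the following sketch, where the model path and the composition feature vectors are placeholders (the model assumed here has a single sigmoid ‘good’ output):

```python
import numpy as np
from keras.models import load_model

# Load the classifier trained on the chosen split (path is a placeholder).
model = load_model("composition_classifier.h5")

def classify_compositions(feature_vectors, threshold=0.5):
    """Label each composition feature vector as 'good' or 'bad' using the
    classifier's sigmoid output and a simple probability threshold."""
    probs = model.predict(np.asarray(feature_vectors)).ravel()
    return ["good" if p >= threshold else "bad" for p in probs]
```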

My next ML steps are:

  • finalize my architecture and train the final model
  • integrate the painting generator and face detection to run as a prototype that logs looking durations for each composition
  • run some experiments using this new dataset collected in the ‘wild’ and decide on thresholds for mapping from duration of looking to “good” and “bad” labels.
  • finally determine the best approach to running training code on the Jetson (embed keras? use ANNetGPGPU? FANN?) and implement it.

#22 Exploration and Refinement

Posted: September 20, 2019 at 10:32 am

#22 turned out quite well; I’ve included my favourite choice on top and two explorations below. Perhaps it could be a little smoother, but I think it’s strong enough to serve for the final selection.


Histogram Features Don’t Improve Classification Accuracy

Posted: September 17, 2019 at 4:16 pm

Rerunning the grid search using the 48 bin (16 bins per channel) colour histogram features provided no classification improvement. The search reported a peak validation accuracy of 74% and a peak training accuracy of 83%. The best model achieved a classification accuracy of 84.6% for training, 70.6% for validation and 72.3% for testing. The confusion matrix for the test set is as follows:

  • 649 bad predicted to be bad.
  • 319 bad predicted to be good.
  • 220 good predicted to be bad.
  • 761 good predicted to be good.
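
As a quick sanity check, these counts are consistent with the reported test accuracy: (649 + 761) / (649 + 319 + 220 + 761) = 1410 / 1949 ≈ 72.3%.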

So it appears I’ve hit a wall and I’m out of ideas. I’ll stick with the initial (instructional) features and see if I can manage a 75% accuracy for an initial model. Looking back at my experiments, it looks like my validation accuracies have ranged from ~62% to ~75% and my test accuracies from ~70% to ~74%.

At least all this experimentation has meant that I have a pretty good idea that such a model will work on the Jetson and I will not even need a deep network. I may even be able to implement the network using one of the C++ libraries I’ve already been using like FANN or ANNetGPGPU.


No Significant Improvement Using Dropout Layers or Changing the Number of Hidden Units.

Posted: September 15, 2019 at 6:13 pm

After the realization that the ~80%+ results were in error, I’ve run a few more experiments using the initial features. Unfortunately, there was no improvement over the ~70% results. I added dropout to the input and hidden layers (there was previously only dropout on the input layer) and changed the number of units in the hidden layer (rather than using the same number as the inputs). I did not try adding a second layer because I have not seen a second hidden layer improve performance in any experiment; perhaps this is due to a lack of sufficient training samples for deep networks.
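
For context, the kind of architecture being tested is a small fully connected network; a rough Keras sketch, where the layer size and dropout rates are placeholders from the search space rather than the chosen values:

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

def build_model(n_features, hidden_units=32, input_dropout=0.2, hidden_dropout=0.5):
    """Single hidden layer MLP with dropout on both the input and hidden layers."""
    model = Sequential([
        Dropout(input_dropout, input_shape=(n_features,)),  # dropout on the inputs
        Dense(hidden_units, activation="relu"),
        Dropout(hidden_dropout),                             # dropout on the hidden layer
        Dense(1, activation="sigmoid"),                      # 'good' / 'bad' probability
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```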

The parameter search found a validation accuracy of 73.4%, while the best model showed a validation accuracy of 73.9% and a test accuracy of 71.8%. The network was not over-fit, with a training accuracy of 88.1%. The confusion matrix for the test set is as follows:

  • 658 bad predicted to be bad.
  • 291 bad predicted to be good.
  • 258 good predicted to be bad.
  • 742 good predicted to be good.

I’m now running a slightly broader hyperparameter search using the 48 bin colour histogram, and if I still can’t get closer to 80% accuracy I’ll classify my third (small) data set and see how it looks. In thinking about this problem I did realize that there has always been a tension in this project: if the network is always learning, its output will become increasingly narrow and never be able to ‘nudge’ the audience’s aesthetic into new territories; there is a need for the system to show the audience ‘risky’ designs to find new aesthetic possibilities. This is akin to getting trapped in local minima; there may be compositions the audience likes even more, but those can only be generated by taking a risk.
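
One simple way to act on this would be an exploration rate, in the spirit of epsilon-greedy selection: most of the time show the composition the classifier scores highest, but occasionally show a random (possibly ‘bad’-scoring) one. This is just a thought experiment in code, not something I’ve implemented:

```python
import random

def choose_composition(candidates, scores, risk=0.1):
    """Mostly exploit (show the highest scoring candidate), but with probability
    `risk` show a random candidate to escape the local minimum of the
    classifier's current notion of 'good'."""
    if random.random() < risk:
        return random.choice(candidates)  # explore: take a risk
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]               # exploit: current best guess
```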


#15 Exploration and Refinement

Posted: September 15, 2019 at 5:33 pm

The top image shows my favourite result for #15, which I think is pretty successful; I was not sure how the abstraction of the original (cubist) source would work out. I think this shows sufficient dissolution of the original. Explorations are included in a gallery below.


~86% Test Accuracy Appears to be Spurious

Posted: September 13, 2019 at 5:09 pm

After running a few more experiments, it seems the reported ~86% test accuracy is spurious and related to a lucky random split of data that was probably highly overlapping with the training data split. The highest test and validation accuracies I’ve seen after evaluating models using the same split as training are merely ~74% and ~71%, respectively.

I did a little more reading on dropout and realized I had not tried different numbers of hidden units in the hidden layer, so I’m running a new search with different input and hidden layer dropout rates, different numbers of hidden units, and a range of epochs and batch_size values. If this does not significantly increase test and validation accuracy then I’ll go back to the colour histogram features, and if that does not work… I have no idea…
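
In practice I’m running the search through talos, but the shape of the search space is roughly the following (the specific values are placeholders, not the real grid), and a plain loop over the combinations illustrates the idea:

```python
from itertools import product

# Rough shape of the search space; the values below are placeholders.
param_grid = {
    "input_dropout":  [0.0, 0.2, 0.4],
    "hidden_dropout": [0.0, 0.25, 0.5],
    "hidden_units":   [16, 32, 64],
    "epochs":         [50, 100, 200],
    "batch_size":     [16, 32, 64],
}

def grid(params):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(params)
    for values in product(*(params[k] for k in keys)):
        yield dict(zip(keys, values))

# for p in grid(param_grid):
#     build, train and evaluate a model with these hyperparameters...
```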


#24 Exploration and Refinement

Posted: September 13, 2019 at 3:43 pm

I spent a little too much time on #24, but I quite like Yves Tanguy and I thought the muted colour palette here would be interesting. I can’t say I’m happy with the results. I suspect the lack of colour diversity is what causes these to require so many training iterations to obliterate the original. The top image is my favourite, and the gallery below shows the other explorations. I’m next moving onto #15.


#3 Refinement

Posted: September 7, 2019 at 9:39 am

I’ve found it quite difficult to get a version of #3 that is smooth and without remnants of the original. The image on the top here is closest, even though a very small detail from the original is still visible. The images below were ruled out.


~86% Test Accuracy Using Initial Features?

Posted: September 5, 2019 at 3:58 pm

Following from the previous results using the new workflow, I went back to my initial features (the 52-dimensional vector of instructions used to generate compositions). The results have turned out to be amazing. The best model achieved accuracies of 85.5% (training), 85.6% (validation) and 85.9% (test). This is a significant increase from the previous best result of 79% (validation). These accuracies are means of accuracies reported over five runs with different splits of the data-set. Note, these splits are still 50/25/25 so that the sizes of the subsets are comparable with previous results. The ‘training’ accuracy is then not actually the accuracy on the data used to train the network, but the accuracy on a random subset of similar size to the training set. 616 bad compositions were predicted to be bad, 105 bad predicted to be good, 105 good predicted to be bad and 634 good predicted to be good. Again, these are averages over multiple predictions with different splits.

As I’m writing this I realize that my validation method is problematic. I set aside a test set (during training) to check generalizability beyond the training and validation sets, but my validation code is a separate instance and has no access to that specific test split. I need to save that specific test set and then validate the best model on it, not on multiple random runs with random splits. This may be skewing my results, since my random splits use both training and validation samples. So what I need to do is save the split used during training and evaluation and run predictions on it. I’m working on those code changes now…
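
The fix I have in mind is roughly the following sketch (file names are placeholders): save the indices of the train/validation/test split at training time, and reuse exactly those test indices when evaluating the best model later:

```python
import numpy as np

# At training time: save the exact split indices alongside the model.
def save_split(train_idx, val_idx, test_idx, path="split.npz"):
    np.savez(path, train=train_idx, val=val_idx, test=test_idx)

# At evaluation time: reload the same held-out test indices instead of re-splitting.
def load_test_set(features, labels, path="split.npz"):
    split = np.load(path)
    test_idx = split["test"]
    return features[test_idx], labels[test_idx]
```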


#1 Refinement

Posted: September 5, 2019 at 10:20 am

I ran a few more iterations appropriating #1 and they are looking quite nice. I think the top image is the most successful, but I’m not convinced by the blueish band near the right edge. I’m happy with the degree of abstraction, where the structure breaks away from the figure form that is still visible in the lower image. I’m starting to realize my choice of neighbourhood size seems to be related to the size of faces in the source: portraits of one person require larger neighbourhoods than group portraits. An interesting side exploration would be to use face detection to automatically determine neighbourhood size for paintings with faces (assuming face detection works well enough on painted faces). I think I’ll leave this one here for now and move along.
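
As a sketch of that side idea (untested, and assuming OpenCV’s stock face detector finds anything at all in painted faces): detect faces, take the median face size, and scale the neighbourhood size from it. The scale factor and fallback value are placeholders:

```python
import cv2
import numpy as np

def neighbourhood_size_from_faces(image_bgr, scale=0.5, default=500):
    """Estimate a neighbourhood size from the median detected face size;
    fall back to `default` when no faces are found."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return default  # no faces detected; fall back to a manual choice
    face_sizes = [max(w, h) for (x, y, w, h) in faces]
    return int(np.median(face_sizes) * scale)
```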


Revisiting Older Experiments

Posted: September 3, 2019 at 5:46 pm

After those recent strong results with the changed code, I’m revisiting older experiments to see if they were in fact showing promise; I’m figuring out whether it was the previous features or the previous validation method that led to that 70% accuracy ceiling.

The 24 colour histogram feature results do not improve upon the 24 hist + 31 non-colour parameter results. I did learn a few things in the process, including that the stochastic splits change the measured accuracy of the best selected model. From this point on I’ll be reporting the mean accuracy and the mean of the confusion matrices over 5 runs using different random splits of validation and test data. I also re-ran the evaluation code on the previous experiment with 24+31 features in case the good results were a fluke. Following are the results.
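
The reporting change is simple; roughly the following sketch, where evaluate_once is a placeholder that trains and evaluates on one random split and returns an accuracy and a 2×2 confusion matrix:

```python
import numpy as np

def evaluate_over_splits(evaluate_once, n_runs=5):
    """Run the evaluation n_runs times with different random splits and report
    the mean accuracy and the element-wise mean of the confusion matrices."""
    accuracies, matrices = [], []
    for seed in range(n_runs):
        accuracy, confusion = evaluate_once(seed)  # placeholder: one run per split
        accuracies.append(accuracy)
        matrices.append(confusion)
    return float(np.mean(accuracies)), np.mean(np.array(matrices), axis=0)
```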

31 + 24 Features

Mean Accuracy:

Training: 78.5%
Validation: 79.5%
Testing: 77.5%

Mean of Confusion Matrices

375.0 bad predicted to be bad
106.4 bad predicted to be good
112.8 good predicted to be bad
381.8 good predicted to be good

24 Hist Features

Mean Accuracy:

Training: 75.9%
Validation: 76.1%
Testing: 75.3%

Mean of Confusion Matrices

531.8 bad predicted to be bad.
194.6 bad predicted to be good.
155.2 good predicted to be bad.
579.4 good predicted to be good.

So the result is that the 31 + 24 features performed much better than the 24 colour hist features alone. I’m rerunning the initial and variance feature experiments using the new validation method.