Deep networks provide no increase in validation accuracy.

Posted: June 12, 2019 at 4:20 pm

After doing quite a bit more learning, I used Talos to search hyperparameters for deeper networks. I ran a few experiments, and in no case could I significantly improve validation accuracy over the simple single-hidden-layer network. While tuning various hyperparameters yields some improvement, all tested network configurations produced validation accuracy between 61% and 73% (60% to 100% on training data). The following plot shows the range of validation accuracy over the number of hidden layers. Note that the jump from 1 to 2 hidden layers does increase validation accuracy, but only by a mean of 0.3%.
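As a rough sketch of what one of these Talos searches looks like: Talos takes a dictionary of candidate hyperparameter values and runs an experiment for each combination, so the experiment count grows multiplicatively with every parameter added. The parameter names and ranges below are illustrative, not the exact ones I used:

```python
from itertools import product

# Hypothetical Talos-style parameter dictionary; the actual names and
# ranges in my experiments may have differed.
params = {
    'hidden_layers': [1, 2, 3, 4],
    'first_neuron': [32, 64, 128],
    'dropout': [0.0, 0.25, 0.5],
    'lr': [0.001, 0.01],
}

# Talos expands this into the Cartesian product of all values, which is
# why even a modest grid produces many training runs.
combos = list(product(*params.values()))
n_experiments = len(combos)  # 4 * 3 * 3 * 2 = 72 runs
```

Even this small grid is 72 full training runs, which is why I kept the number of values per parameter low.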

The confusion matrix for the best model is about the same as it was for the single-hidden-layer model first trained in Keras (without hyperparameter tuning!):
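For anyone unfamiliar with reading these: a confusion matrix just counts (true label, predicted label) pairs, with correct predictions on the diagonal. A minimal stdlib sketch, using toy labels in place of my actual composition classes:

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) pairs into a nested dict of counts."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}

# Toy data standing in for the real composition classes.
y_true = ['a', 'a', 'b', 'b', 'b', 'c']
y_pred = ['a', 'b', 'b', 'b', 'c', 'c']

cm = confusion_matrix(y_true, y_pred, labels=['a', 'b', 'c'])

# Diagonal entries are correct predictions, so accuracy falls out directly.
accuracy = sum(cm[l][l] for l in 'abc') / len(y_true)  # 4/6
```

Comparing two models' matrices cell by cell, as I did here, shows whether they are making the same kinds of mistakes, not just scoring the same overall accuracy.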

Over the last couple of weeks I have made no significant gains with Keras, so the problem is clearly my features. Everything I'm seeing seems to confirm my initial fears regarding the lack of separability in my early t-SNE results. So I have a few ideas on how to move forward:

  1. Rather than using the raw features for classification, compute some initial stats on those features and use those stats for training. This only affects features that can be grouped, e.g. stats on the set of colours of all layers in a composition. Two ideas are the variance of such groups of features, or full histograms for each group of features.
  2. Since my features are normalized, they all have the same range. This means that, regardless of their labels, all features will have the same stats, making #1 moot! So it looks like I should convert my code so that the features are the actual numbers used to generate compositions, not these evenly distributed random numbers in [0, 1]. This means generating and labelling a new data-set.
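Idea #2 amounts to mapping each uniform [0, 1] value onto the actual range the generator uses for that parameter, and storing the mapped value as the feature instead. A minimal sketch, where the parameter names and bounds are purely hypothetical stand-ins for the real generator's:

```python
import random

# Hypothetical parameter ranges used by the composition generator;
# names and bounds are illustrative, not from the actual project.
PARAM_RANGES = {
    'layer_count': (1, 12),
    'hue': (0.0, 360.0),
    'opacity': (0.05, 1.0),
}

def denormalize(name, u):
    """Map a uniform [0, 1] value onto the parameter's actual range."""
    lo, hi = PARAM_RANGES[name]
    return lo + u * (hi - lo)

# The stored feature vector would then hold these real values, whose
# distributions can differ per parameter, rather than the raw uniforms.
random.seed(42)
sample = {name: denormalize(name, random.random()) for name in PARAM_RANGES}
```

With features on their real scales, group statistics like the variances and histograms from idea #1 can actually differ between parameters, instead of every feature sharing the same uniform distribution.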