Finally Cracked the 70% Validation Accuracy Wall!

Posted: August 21, 2019 at 12:46 pm

I changed my Talos code to explicitly select the best model by calling Predict(), and added a 10-fold cross-validation step with Evaluate() before saving the search session. It is not entirely clear to me whether these two calls change the criteria by which models are selected for deployment, but on my first use of them performance jumped by 10%.
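For reference, here is a minimal sketch of how I understand these calls fitting together. The variable names (model_fn, params, the x/y arrays) are illustrative, and the exact argument names vary between Talos versions, so treat the signatures as assumptions rather than a definitive recipe.

```python
import talos

# Hyperparameter search over the training data, validated against a
# held-out validation set (model_fn and params are my own Keras
# model-builder function and parameter dictionary).
scan_object = talos.Scan(x=x_train, y=y_train,
                         x_val=x_val, y_val=y_val,
                         params=params, model=model_fn,
                         experiment_name='composition_classifier')

# Best-model selection: Predict picks the highest-scoring model from
# the scan and uses it to generate predictions.
predictions = talos.Predict(scan_object).predict(x_test)

# 10-fold cross validation of the best model on the testing set
# before the session is saved/deployed.
results = talos.Evaluate(scan_object).evaluate(x_test, y_test, folds=10)
```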

I also split my data differently: 50/25/25% for training, validation and testing. The validation set is used in Talos Scan() and the testing set in Evaluate(). This session used 31 features from the initial dataset (the instructions used to generate compositions, excluding colour data) plus 25 colour histogram features. I had been wondering whether the dimensionality of my features was simply too high to get anywhere with as few samples as I have.
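As a sketch of the split (variable names are illustrative, not my actual code), a 50/25/25 division can be produced with two calls to scikit-learn's train_test_split:

```python
from sklearn.model_selection import train_test_split

# X has 56 columns (31 composition-instruction features + 25 colour
# histogram features); y holds the good/bad labels.
# First split off 50% for training.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.5, stratify=y)

# Split the remainder in half: 25% validation (used by Talos Scan())
# and 25% testing (used by Evaluate()).
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest)
```

No random_state is fixed here, matching the fact that I removed the fixed RNG seeds from the splits.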

The best model reported an accuracy of 78.4% on the training set, 80% on the validation set and 77% on the testing set. This is a huge improvement and makes me wonder whether Talos was previously just selecting a very poor ‘best model’. One caveat is that the log Talos generates during training shows very different results: there, the highest accuracy was 56.8% on the validation set and 100% on the training set, highly divergent from the prediction accuracy of the best model. I should also note that I removed the fixed RNG seeds for splits and data shuffling, so the search is stochastic and may be getting a broader picture since it is not limited to a single reproducible split. On the validation set, the best model predicted 304 bad compositions as bad, 70 bad as good, 74 good as bad, and 284 good as good.
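Those validation-set counts form a small confusion matrix, and the derived numbers are consistent with the ~80% validation accuracy reported above. This is just arithmetic over the counts, not output from my code:

```python
# Confusion matrix from the validation-set predictions reported above
# (treating "good" as the positive class).
tn = 304  # bad compositions predicted bad
fp = 70   # bad compositions predicted good
fn = 74   # good compositions predicted bad
tp = 284  # good compositions predicted good

total = tn + fp + fn + tp        # 732 validation samples
accuracy = (tn + tp) / total     # ~0.803
precision = tp / (tp + fp)       # ~0.802 for the "good" class
recall = tp / (tp + fn)          # ~0.793 for the "good" class

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```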

If I can reproduce this performance, I’ll then generate a new set of random compositions and see how the best model classifies them.