Skip to main content


Learn About Our Meetup

5000+ Members



Join our meetup, learn, connect, share, and get to know your Toronto AI community. 



Browse through the latest deep learning, ai, machine learning postings from Indeed for the GTA.



Are you looking to sponsor space, be a speaker, or volunteer, feel free to give us a shout.

[D] Unstable performance during parameter search (Keras)

Hi all,

I was hoping we could discuss the plot below:

That plot comes from a parameter search using Keras/Tensorflow for a binary classification problem with an unbalanced class distribution (as you can tell from the acc plot, the ratio is about 5:1 negative to positive).

The metric that I am most interested in is Precision, and as you can see in this example it is very unstable, bouncing around wildly between epochs – which obviously doesn’t lend itself to being a good/stable model.

Whilst there is a little overfitting, there doesn’t seem to be too much and I can confirm that the data itself is all properly scaled and normalised.

Although the plot scale is a bit large (sorry) to tell properly, I think what we’d find is that Recall fluctuates in unison with Precision. As Recall bounces upwards, I’d expect Precision to take a dive downwards.

I can’t post the exact model because it’s a parameter search with a wide range of possible configurations, but I’m optimising across a range of network depths, widths, dropouts, shapes, learning rate, etc. I’m using binary_crossentropy as the loss, Elu activations, and Nadam optimizer – though I’ve tried a various others with similar results.

What would be your suggestions for creating a more stable model?

At the moment, the class_weight is set to 0:1, 1:1. I think upping the positive class ratio would somewhat stabilise the model (by increasing recall), but I’m shooting to have a high precision and accepting that my recall will be the trade-off and be somewhat low. For example, I’d be happy with 57% precision at 5% recall. In fact – that’s the exact result I got from a previous parameter search, but it didn’t generalise well to the blind test set, and I’m suspecting that the cause was the unstable epoch-to-epoch precision we’re seeing in this plot (though I can only see the plots for the “current” model being generated, so by the end of the many-hour parameter search all I have is a csv of the final values, with no plots to go along with them).

submitted by /u/Zman420
[link] [comments]