Category: Reddit MachineLearning

[D] Unstable performance during parameter search (Keras)

Hi all,

I was hoping we could discuss the plot below:

That plot comes from a parameter search using Keras/TensorFlow for a binary classification problem with an imbalanced class distribution (as you can tell from the accuracy plot, the ratio is about 5:1 negative to positive).

The metric that I am most interested in is Precision, and as you can see in this example it is very unstable, bouncing around wildly between epochs, which doesn’t make for a good, stable model.

Whilst there is a little overfitting, there doesn’t seem to be too much and I can confirm that the data itself is all properly scaled and normalised.

Although the plot scale is a bit large (sorry) to tell properly, I think what we’d find is that Recall fluctuates in unison with Precision. As Recall bounces upwards, I’d expect Precision to take a dive downwards.
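
To make the trade-off concrete, here is a minimal sketch in plain Python (no Keras) of how precision and recall follow from thresholded predictions. The function name `precision_recall`, the toy labels and the toy scores are all made up for illustration; none of them come from the model in the post. The zero-denominator convention is also an assumption.

```python
def precision_recall(y_true, scores, threshold=0.5):
    """Return (precision, recall) for binary labels and predicted scores."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    # Assumed convention: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data: 2 positives and 10 negatives (a 5:1 ratio, as in the post).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.1, 0.2, 0.3, 0.1, 0.2, 0.45]

for t in (0.35, 0.5, 0.85):
    p, r = precision_recall(y_true, scores, t)
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

In this toy data, raising the threshold from 0.35 to 0.85 lifts precision from 0.5 to 1.0 while recall falls from 1.0 to 0.5. At the highest threshold, precision rests on a single predicted positive, so one changed prediction would move it a lot.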

I can’t post the exact model because it’s a parameter search with a wide range of possible configurations, but I’m optimising across a range of network depths, widths, dropouts, shapes, learning rates, etc. I’m using binary_crossentropy as the loss, ELU activations, and the Nadam optimizer, though I’ve tried various others with similar results.

What would be your suggestions for creating a more stable model?

At the moment, class_weight is set to {0: 1, 1: 1}, i.e. both classes are weighted equally. I think upping the positive class weight would somewhat stabilise the model (by increasing recall), but I’m aiming for high precision and accept that recall will be the trade-off and stay fairly low. For example, I’d be happy with 57% precision at 5% recall. In fact, that’s the exact result I got from a previous parameter search, but it didn’t generalise well to the blind test set, and I suspect the cause was the unstable epoch-to-epoch precision we’re seeing in this plot. (I can only see the plots for the “current” model being generated, so by the end of the many-hour parameter search all I have is a CSV of the final values, with no plots to go along with them.)
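
One way to keep more than the final values from a long search is to store the per-epoch metric and summarise how much it moves over the last few epochs. The sketch below is plain Python and only an illustration of that idea: `tail_stability`, the configuration names and the precision values are hypothetical, not results from the search described above.

```python
import statistics


def tail_stability(values, last_k=5):
    """Mean and population std-dev of the last `last_k` per-epoch values."""
    tail = values[-last_k:]
    return statistics.mean(tail), statistics.pstdev(tail)


# Hypothetical per-epoch validation precision for two configurations.
history = {
    "config_a": [0.30, 0.55, 0.20, 0.60, 0.25, 0.57],
    "config_b": [0.30, 0.38, 0.42, 0.44, 0.45, 0.45],
}

for name, precisions in history.items():
    mean, std = tail_stability(precisions, last_k=4)
    print(f"{name}: final={precisions[-1]:.2f} mean={mean:.3f} std={std:.3f}")
```

In this made-up history, config_a ends on the higher final precision but has a much larger spread over its last four epochs, which a CSV of final values alone would hide. Keras also ships a CSVLogger callback that writes per-epoch metrics to a file, which may be a simpler route to the same data.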

submitted by /u/Zman420

[R] Autonomous Navigation in Unconstrained Environments

While several datasets for autonomous navigation have become available in recent years, they have tended to focus on structured driving environments. This usually corresponds to well-delineated infrastructure such as lanes, a small number of well-defined categories for traffic participants, low variation in an object or background appearance and strong adherence to traffic rules.

I recently worked with IDD, a dataset collected in India. It’s more challenging than other autonomous navigation datasets (such as Berkeley DeepDrive or Cityscapes), since much of the data was captured in non-standard conditions (drivable areas other than roads, etc.).

I’m releasing the code for this work; feel free to use it for your projects or research.



submitted by /u/vector_machines

[D] Why have we not seen equivalent success in deep learning based image registration?

Other computer vision tasks such as classification, segmentation and synthesis have seen huge advances in accuracy thanks to CNNs, but there seems to be no equivalent advance in image registration. I tried searching for advances in image registration, but researchers apparently still use ‘classical’ techniques like mutual information, cross-correlation, etc. Even though there are DL image registration papers, they are not widely adopted in the community.

Fundamentally, is there a reason why this task is more complex than the aforementioned ones?
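
For readers unfamiliar with the ‘classical’ approach, here is a toy sketch of cross-correlation-based registration in plain Python. It is only an illustration under strong assumptions: the signals are 1-D, the only transform searched is an integer translation, and the names `ncc` and `best_shift` are made up here. Real registration works on 2-D or 3-D images with richer transforms.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A constant signal has zero variance; treat its correlation as 0.0.
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def best_shift(fixed, moving, max_shift):
    """Integer shift s maximizing NCC between fixed[i] and moving[i + s]."""
    best = (0, -1.0)
    for s in range(-max_shift, max_shift + 1):
        lo = max(0, -s)
        hi = min(len(fixed), len(moving) - s)
        if hi - lo < 2:
            continue  # too little overlap to correlate
        score = ncc(fixed[lo:hi], moving[lo + s:hi + s])
        if score > best[1]:
            best = (s, score)
    return best


fixed = [0, 0, 1, 3, 1, 0, 0, 0]
moving = [0, 0, 0, 0, 1, 3, 1, 0]  # same bump, moved two samples right
shift, score = best_shift(fixed, moving, max_shift=3)
print(shift, round(score, 3))
```

The search is exhaustive over a handful of shifts, which is fine for a toy but is exactly what becomes expensive once the transform has many parameters.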

submitted by /u/deep-yearning

[R] ABD-Net Person Re-ID code is available

Attention mechanisms have been shown to be effective for person re-identification (Re-ID). However, the learned attentive feature embeddings, which are often neither naturally diverse nor uncorrelated, can compromise retrieval performance based on Euclidean distance. We advocate that enforcing diversity could greatly complement the power of attention. To this end, we propose an Attentive but Diverse Network (ABD-Net), which seamlessly integrates attention modules and diversity regularization throughout the entire network, to learn features that are representative, robust, and more discriminative.

submitted by /u/yang-explore

[N] HGX-2 Deep Learning Benchmarks: The 81,920 CUDA Core “Behemoth” GPU Server

Deep learning benchmarks for TensorFlow on Exxact TensorEX HGX-2 Server.

Original Post from Exxact Here

Notable GPU Server Features

  • 16x NVIDIA Tesla V100 SXM3
  • 81,920 NVIDIA CUDA Cores
  • 10,240 NVIDIA Tensor Cores
  • 0.5 TB Total GPU Memory
  • NVSwitch powered by NVLink, 2.4 TB/sec aggregate speed
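
The headline totals in the list above follow from simple per-GPU arithmetic. The short check below assumes NVIDIA's published per-GPU figures for the Tesla V100 (5,120 CUDA cores, 640 Tensor Cores) and the 32 GB memory variant; those per-GPU numbers are not stated in the post itself.

```python
NUM_GPUS = 16

# Per-GPU figures assumed from NVIDIA's V100 specifications (32 GB variant).
CUDA_CORES_PER_V100 = 5120
TENSOR_CORES_PER_V100 = 640
MEMORY_GB_PER_V100 = 32

total_cuda_cores = NUM_GPUS * CUDA_CORES_PER_V100
total_tensor_cores = NUM_GPUS * TENSOR_CORES_PER_V100
total_memory_gb = NUM_GPUS * MEMORY_GB_PER_V100

print(f"CUDA cores:   {total_cuda_cores:,}")
print(f"Tensor cores: {total_tensor_cores:,}")
print(f"GPU memory:   {total_memory_gb} GB = {total_memory_gb / 1024} TB")
```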



Tests were run on ResNet-50, ResNet-152, Inception V3 and VGG-16. FP16 was also compared with FP32 performance, using a batch size of 256 (except for ResNet-152 FP32, where the batch size was 64). The same tests were run using 1, 2, 4, 8 and 16 GPU configurations. All benchmarks were done using ‘vanilla’ TensorFlow settings for FP16 and FP32.

For the full write-up + tables and numbers visit:

submitted by /u/exxact-jm

[P] Pytorch library of NLP pre-trained models has a new model to offer: RoBERTa

Huggingface has released a new version of their open-source library of pre-trained transformer models for NLP: pytorch-transformers 1.1.0.

On top of the already integrated architectures: Google’s BERT, OpenAI’s GPT & GPT-2, Google/CMU’s Transformer-XL & XLNet and Facebook’s XLM, they have added Facebook’s RoBERTa, which has a slightly different pre-training approach than BERT while keeping the original model architecture.

The RoBERTa model gets SOTA results on SuperGLUE.

Install: pip install pytorch-transformers


Release notes:


submitted by /u/jikkii

[P] GPT-2 small fine-tuned on The Stig intros from Top Gear.

Yesterday I was struck by how funny the Stig intros generated by this project from u/fsaifdiwq were, so I wanted to see if I could get GPT-2 to do only Stig intros. Due to the size of the dataset, it shows signs of overfitting, but many results are still new and quite fun.

Some say that he is confused by clouds and thinks the wind is behind the clouds. All we know is hes called the Stig.

Some say he has the worlds largest collection of toe nails that he has harvested off his bee hive and that he has a full body tattoo that could make a black hole. All we know is hes called the Stig.

Some say that he once slipped on an orange peel, some say that he ate a porcelain mug, without any tea. All we know is hes called the Stig.

Some say he tastes exactly the same as leprechauns, and that his favourite pastime is crushing other peoples skulls with his glass of milk. All we know is hes called the Stig.

Some say he won the Nobel Peace Prize, and that hes convinced Jesus and The Lion King are The Lion King 1 and 2. All we know is hes called the Stig.

Some say, ”There is a white paper on the problem with trees being bad for him but not for his skin”. Others say his eyelids are white, and that in 2007 he was convicted of the theft of The Lamps. All we know is hes called the Stig.

Some say he once punched a lady in the face and he was confused until he realised people arent that good at sitting still; and that his favorite language is Chinese. All we know is hes called the Stig.

You can also begin with a topic yourself and see how it finishes the introduction. Here I made it mention reddit:

Some say he uses reddit to his own detriment, by posting anti-Donald tweeters. All we know is hes called the Stig.

Some say he uses reddit to look at children’s books and that he is lucky starstruck with Justin Bieber. All we know is hes called the Stig.

Some say he uses reddit to look at nipples and that when he watches children on cell phone video he sees little to no reason to change them. All we know is hes called the Stig.

Some say reddit is his new bathroom partner, and that when he goes to the beach to have a bonfire. All we know is hes called the Stig.

Some say reddit is his social media following and that has not bought the entire worlds supply of earthworms for his new home planet on pluto just yet. All we know is hes called the Stig.

I hope you find them as amusing as I do!

submitted by /u/lilsmacky

Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, vr, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.