Category: Reddit MachineLearning

[D] Can a Machine Learn to Write for the New Yorker? (OpenAI finetunes largest GPT-2 on New Yorker articles)

https://www.newyorker.com/magazine/2019/10/14/can-a-machine-learn-to-write-for-the-new-yorker

Clever use of interactive elements in the article.

That said, I am really confused by OpenAI’s PR strategy for GPT-2; they push it heavily as the future of text writing, but all of their GPT-2 related repos are archived w/o any updates expected. (EDIT: Greg Brockman commented on that comment: https://twitter.com/gdb/status/1181253557833486336 )

submitted by /u/minimaxir

[D] State Of The Art Activation Function: GELU, SELU, ELU, ReLU and more. With visualization of the activation functions and their derivatives.

https://mlfromscratch.com/activation-functions-explained/
(Intermediate level and above: you can probably skip at least the first two headers and go straight to ReLU)

I recently wrote a long-form post explaining and visualizing the various activation functions. The math is not that complicated, but knowing the strengths and weaknesses of each activation function, or even just knowing that it exists, can prove useful.

Any feedback is appreciated. I share what I learn so that other people can learn from it as well. This is not an advanced topic, but it does provide an overview of SOTA activation functions, and the plan is to make similar posts for more advanced topics in the future.
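For readers who want the definitions at a glance, here is a minimal sketch of the four functions named in the title, written with the standard library only. It is not taken from the linked post: the function names, the default `alpha=1.0` for ELU, and the finite-difference `derivative` helper are assumptions made for illustration. GELU is written in its exact form using the Gaussian error function, and the SELU constants are the commonly published values.

```python
import math

# Commonly published SELU constants (self-normalizing networks).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x):
    """ReLU: pass positive inputs through, clamp negatives to zero."""
    return max(0.0, x)


def elu(x, alpha=1.0):
    """ELU: identity for x > 0, smooth exponential saturation below zero."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x):
    """SELU: ELU with fixed alpha, multiplied by a fixed scale."""
    return SELU_SCALE * elu(x, SELU_ALPHA)


def gelu(x):
    """GELU (exact form): x times the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def derivative(f, x, h=1e-5):
    """Illustrative central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for name, f in [("relu", relu), ("elu", elu), ("selu", selu), ("gelu", gelu)]:
        print(name, [round(f(v), 4) for v in (-2.0, -0.5, 0.0, 0.5, 2.0)])
```

The numerical derivative is only meant for quick plotting or sanity checks; it is unreliable right at the kink of ReLU (x = 0).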

submitted by /u/permalip

[Discussion] ICLR 2020 Interesting Papers Thread

I tried to browse through the ICLR submissions on OpenReview this past week. After spending a whole day, I was not even 5% done. The search tool is not much help for narrowing things down to your areas of interest. Besides, intersectionality is at an all-time high in the field.

I think it would be better if we could crowdsource interesting papers here. Post papers you have read or are about to read.

I’ll start:

submitted by /u/metacurse

[P] The Joy of Neural Painting – Learning Neural Painters Fast! using PyTorch and Fast.ai

TL;DR

Neural Painters are a class of models that can be seen as a fully differentiable simulation of a particular non-differentiable painting program. In other words, the machine “paints” by successively generating brushstrokes (i.e., actions that define a brushstroke) and applying them to a canvas, as an artist would do.

Neural Painters are based on GANs, which are great generative models but are notoriously difficult to train, especially because they require a large amount of data and therefore a lot of GPU compute. They take a long time to train and are sensitive to small hyperparameter variations.

To overcome these known GAN limitations and to speed up the Neural Painter training process, we leveraged the power of Transfer Learning.

The main steps are as follows:

(1) Pre-train the Generator with a non-adversarial loss, e.g., using a feature loss (also known as perceptual loss)

(2) Freeze the pre-trained Generator weights

(3) Pre-train the Critic as a Binary Classifier (i.e., non-adversarially), using the pre-trained Generator (in evaluation mode with frozen model weights) to generate “fake” brushstrokes. That is, the Critic should learn to discriminate between real images and generated ones. This step uses a standard binary classification loss, i.e., Binary Cross Entropy, not a GAN loss

(4) Transfer learning for adversarial training (GAN mode): continue the Generator and Critic training in a GAN setting. Faster!
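The four steps above can be sketched in plain PyTorch roughly as follows. This is a toy illustration under loud assumptions, not the project's actual code: the tiny `generator` and `critic` networks, the random `actions` and `real` tensors, the `set_trainable` helper, the use of MSE as a stand-in for the feature (perceptual) loss, and the BCE-based GAN step in (4) are all hypothetical choices made to keep the example short.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy stand-ins: an "action" is an 8-dim vector,
# a "brushstroke image" is flattened to a 32-dim vector.
generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 32))
critic = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 1))
actions = torch.randn(64, 8)   # placeholder brushstroke actions
real = torch.randn(64, 32)     # placeholder strokes from the painting program
ones, zeros = torch.ones(64, 1), torch.zeros(64, 1)
bce = nn.BCEWithLogitsLoss()


def set_trainable(model, flag):
    """Toggle gradient tracking for every parameter of a model."""
    for p in model.parameters():
        p.requires_grad_(flag)


# (1) Pre-train the Generator with a non-adversarial loss.
#     Assumption: MSE stands in for the feature (perceptual) loss.
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
for _ in range(20):
    opt_g.zero_grad()
    nn.functional.mse_loss(generator(actions), real).backward()
    opt_g.step()

# (2) Freeze the pre-trained Generator weights.
generator.eval()
set_trainable(generator, False)
frozen = not any(p.requires_grad for p in generator.parameters())

# (3) Pre-train the Critic as a binary classifier (real -> 1, fake -> 0).
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
with torch.no_grad():
    fake = generator(actions)
inputs, labels = torch.cat([real, fake]), torch.cat([ones, zeros])
for _ in range(20):
    opt_c.zero_grad()
    bce(critic(inputs), labels).backward()
    opt_c.step()

# (4) Unfreeze and continue adversarially (a single GAN step is shown).
generator.train()
set_trainable(generator, True)

opt_c.zero_grad()                      # Critic update
fake = generator(actions)
c_loss = bce(critic(real), ones) + bce(critic(fake.detach()), zeros)
c_loss.backward()
opt_c.step()

opt_g.zero_grad()                      # Generator update: fool the Critic
g_loss = bce(critic(generator(actions)), ones)
g_loss.backward()
opt_g.step()
```

The point of the sketch is the ordering: both networks start the adversarial phase from pre-trained weights instead of random initialization, which is where the speed-up is claimed to come from.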

submitted by /u/bluebalam

[D] Documented code for reproducible experiments on meta-learning algorithms

Hi everyone! I worked for 6 months on meta-learning algorithms for few-shot computer vision (classifying images or detecting objects with few examples). The code for my experiments is now public:

https://github.com/ebennequin/FewShotVision

It’s fully documented, and you should be able to launch your own experiments in a transparent and reproducible way. Please tell me if I can improve it!

submitted by /u/etienne_ben

[D] Best practice when dealing with feature pairs with strong Pearson Correlation scores

Let's say we have some feature pairs with strong Pearson correlation scores that are:

  • Exactly +1 or -1 (let's call these T1 pairs)
  • Very close to +1 or -1, or above a threshold (T2)
  • Correlating very closely, like the T2s, but with multiple other features, where the features it correlates with do not correlate suspiciously with each other (T3)

Let's call the threshold past which we say feature pairs are T2 or T3 the THRESH.

Let's also make the following assumptions about these suspicious feature pairs:

  1. They are not One Hot Encoded or some kind of ordinal encoding
  2. They are all floating point numbers with high variances
  3. At least one of them has good correlation with the label(s)

What I would like to discuss is the following:

Options with T1, T2 and T3:

  1. Drop the one with a bad or lower correlation with the label(s)
  2. Drop one regardless
  3. Drop both and replace with a new feature that combines both: interaction

Options with T3:

  1. Drop the common feature if it correlates badly with the label(s), or worse than any of the other features it is strongly correlated with
  2. Drop the common feature regardless
  3. Drop all and interact the common feature with each of its buddies
  4. Drop all and interact the entire group with each other

Options with THRESH:

  1. Always a constant value (specify the value)
  2. A low custom value when you want to do feature reduction and a high custom value for when you want the most descriptive features only

Sample strategies:

  • T11 T21 T31 THRESH: 0.8 means drop worst T1 then drop worst T2 then drop worst T3 at constant suspicion threshold of +0.8 and -0.8
  • T11 T31 T23 THRESH: 2 means drop worst T1 then drop worst in T3 then interact remaining T2s with each other at a suspicion threshold that depends on what you seek to accomplish with the current dataset

Notice how the order matters. Please use this convention to make it fast and easy to understand what you think is best and then add your reasons. Feel free to suggest new options. I will add them to the post.
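To make option 1 for T1/T2 pairs concrete, here is a minimal sketch using only the standard library. The helper names (`pearson`, `drop_weaker_of_correlated_pairs`), the toy data, and the default threshold of 0.8 are assumptions for illustration; T3 groups are not handled. It assumes no feature is constant (which matches the "high variance" assumption above), since a zero variance would cause a division by zero.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def drop_weaker_of_correlated_pairs(features, label, thresh=0.8):
    """For each pair with |r| >= thresh, drop the feature that is
    less correlated (in absolute value) with the label."""
    kept = dict(features)
    for a, b in combinations(list(features), 2):
        if a not in kept or b not in kept:
            continue  # one of the two was already dropped
        if abs(pearson(features[a], features[b])) >= thresh:
            ra = abs(pearson(features[a], label))
            rb = abs(pearson(features[b], label))
            del kept[a if ra < rb else b]
    return kept


# Hypothetical toy data: f2 is almost a rescaled copy of f1.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
label = [1.0, 2.0, 3.0, 4.0, 5.0]
kept = drop_weaker_of_correlated_pairs(features, label, thresh=0.8)
print(sorted(kept))
```

Because features are dropped as pairs are visited, the result can depend on the order in which pairs are checked, which is the same order-sensitivity the strategy notation is meant to capture.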

submitted by /u/times_of_change

[P] A step-by-step Policy Gradient algorithms Colab + PyTorch tutorial

Hi, ML redditors! My colleagues and I made a Reinforcement Learning tutorial in PyTorch that covers Policy Gradient algorithms from A2C to SAC. It also includes methods that use demonstrations to accelerate learning in real applications with sparse rewards:

  1. A2C
  2. PPO
  3. DDPG
  4. TD3
  5. SAC
  6. DDPG from Demonstration
  7. Behavior Cloning (with DDPG)

Every chapter contains both theoretical background and an object-oriented implementation, and thanks to Colab, you can execute them and render the results without any installation, even on your smartphone!

I hope it will be helpful for someone. 🙂

Cheers.

submitted by /u/syee-kim