Category: Reddit MachineLearning

[D] Please help me find this paper (Foundations of DL)!

Dear all,

I have been searching for some time now for a paper I read a while ago but misplaced.

The paper was very interesting and showed that optimisation with deep neural networks is, in some sense, easier than with shallow neural networks.

The authors generated a data set by sampling random input data and then using the predictions of a shallow neural network (A) as the ground-truth labels for those data. They then tried to train another shallow network (B), with the same architecture as (A) but a different initialization. They showed that it was very difficult to find the optimal solution for this data set. They then tried the same task with a deeper network (C) and found the optimal solution.
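
The setup described above can be sketched roughly as follows. This is only an illustration of the data-generation step, not the paper's code: the layer sizes, the tanh activation, the Gaussian inputs, and every function name are assumptions, and no training loop is included.

```python
import math
import random

def init_shallow_net(n_in, n_hidden, rng):
    """One-hidden-layer network with a scalar output; weights drawn from N(0, 1)."""
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
    return w1, w2

def forward(net, x):
    """tanh hidden layer followed by a linear read-out (assumed architecture)."""
    w1, w2 = net
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return sum(v * h for v, h in zip(w2, hidden))

def make_teacher_dataset(teacher, n_samples, n_in, rng):
    """Random inputs, labelled by the teacher network's own predictions."""
    xs = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_samples)]
    ys = [forward(teacher, x) for x in xs]
    return xs, ys

def mse(net, xs, ys):
    """Mean squared error of a network on the data set."""
    return sum((forward(net, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)
teacher_a = init_shallow_net(n_in=5, n_hidden=8, rng=rng)  # network (A)
xs, ys = make_teacher_dataset(teacher_a, n_samples=200, n_in=5, rng=rng)
student_b = init_shallow_net(n_in=5, n_hidden=8, rng=rng)  # network (B), fresh initialization

print("teacher loss:", mse(teacher_a, xs, ys))
print("untrained student loss:", mse(student_b, xs, ys))
```

Because the labels come from (A), a zero-loss solution with that architecture exists by construction, so any failure of (B) to reach it is an optimisation problem rather than a capacity problem.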

If anyone knows the name of this paper then please let me know where I can find it. I would be eternally grateful!

Many thanks!

submitted by /u/HenryWJReeve

[R] New Graph Classification Data Sets

Graph classification has become popular recently, which has led to rich development of Graph Kernels and Graph Neural Networks. Papers generally verify their results on 10-15 benchmark data sets. We found that these data sets (and 40 others) contain many isomorphic graphs, which leads to (1) train-to-test leakage and (2) incorrect validation comparisons. Absurdly, some isomorphic graphs have different classification labels, making it impossible to classify such instances correctly. We explain why these isomorphic instances appear in the data sets in the first place (e.g. meta-data, sizes of graphs, or origin of a data set) and open-source new clean data sets, both on GitHub and in PyTorch-Geometric.
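
To make the two problems concrete, here is a minimal sketch of how isomorphic duplicates could be detected. It is not the method from the paper: it assumes small, undirected graphs without node or edge features, uses a brute-force canonical form over all node relabellings (n! of them), and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

def canonical_form(n_nodes, edges):
    """Brute-force canonical form: the smallest sorted edge list over all relabellings.

    Only feasible for very small graphs, since it tries n! permutations.
    """
    best = None
    for perm in permutations(range(n_nodes)):
        relabelled = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or relabelled < best:
            best = relabelled
    return (n_nodes, best)

def find_isomorphism_problems(train, test):
    """Each data set is a list of (n_nodes, edges, label) triples.

    Returns the indices of test graphs isomorphic to some training graph,
    and the canonical forms that occur with more than one label.
    """
    labels_by_form = defaultdict(set)
    train_forms = set()
    for n, edges, label in train:
        form = canonical_form(n, edges)
        train_forms.add(form)
        labels_by_form[form].add(label)
    leaked = []
    for i, (n, edges, label) in enumerate(test):
        form = canonical_form(n, edges)
        labels_by_form[form].add(label)
        if form in train_forms:
            leaked.append(i)
    conflicting = [form for form, labels in labels_by_form.items() if len(labels) > 1]
    return leaked, conflicting

# Toy data: the first test graph is a relabelled copy of the first training graph.
train = [(3, [(0, 1), (1, 2)], "a"), (3, [(0, 1), (1, 2), (0, 2)], "b")]
test = [(3, [(0, 2), (2, 1)], "b"), (4, [(0, 1), (2, 3)], "a")]
leaked, conflicting = find_isomorphism_problems(train, test)
print("leaked test indices:", leaked)
print("forms with conflicting labels:", conflicting)
```

In the toy data, the 3-node path appears in both splits (leakage) and with two different labels (a conflict that no classifier can resolve).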

Here is a link to the paper: https://arxiv.org/abs/1910.12091

Here is a more informal blog post about the findings.

submitted by /u/nd7141

[R] Theoretical research papers on GANs

Hello,

I am doing some research on GANs, and I am looking for mathematical/theoretical articles. I have noticed a lot of papers presenting new types of GANs with (sometimes) just some minor alterations to the original. Two papers that fall within the category I am looking for are:

‘On the limitations of First-Order Approximation in GAN Dynamics’ -> https://arxiv.org/pdf/1706.09884.pdf

‘Which training methods for GANs do actually converge?’ -> https://arxiv.org/pdf/1801.04406.pdf

Both start from a simple model, prove some properties mathematically, and then demonstrate them empirically.

I hope the question is clear, and thank you in advance!

submitted by /u/Mushr00mParadise

[D] Regression tasks with “duplicate samples”

Assume there is a data set {(x_i, y_i)}, 0<=i<n, in which some samples have the same x value but different y values (x_i == x_j && y_i != y_j) because of noise during data collection.

A common method may be to group them under a single y, such as their mean value.
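
As a rough illustration of that grouping step, the sketch below collapses duplicate x values to their mean y and keeps the count of each group. The toy data, the straight-line model, and the function names are assumptions made for the example.

```python
from collections import defaultdict

def group_by_mean(samples):
    """Collapse samples sharing the same x into one (x, mean y, count) row."""
    buckets = defaultdict(list)
    for x, y in samples:
        buckets[x].append(y)
    return [(x, sum(ys) / len(ys), len(ys)) for x, ys in sorted(buckets.items())]

def fit_line(rows):
    """Weighted least-squares line y = slope * x + intercept over (x, y, weight) rows."""
    w_sum = sum(w for _, _, w in rows)
    x_bar = sum(w * x for x, _, w in rows) / w_sum
    y_bar = sum(w * y for _, y, w in rows) / w_sum
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in rows)
    sxx = sum(w * (x - x_bar) ** 2 for x, _, w in rows)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

samples = [(1.0, 2.0), (1.0, 4.0), (2.0, 5.0), (3.0, 6.0), (3.0, 8.0), (3.0, 10.0)]
grouped = group_by_mean(samples)

raw_fit = fit_line([(x, y, 1) for x, y in samples])  # every sample kept, weight 1
grouped_fit = fit_line(grouped)                      # group means, weighted by count

print("grouped rows:", grouped)
print("fit on raw samples:", raw_fit)
print("fit on grouped means:", grouped_fit)
```

For a linear model under squared error, fitting the raw samples and fitting the count-weighted group means give the same line, so in that particular setting the duplicates can simply be left ungrouped.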

But is there any research that handles this kind of data without grouping the samples together?

If yes, what is this kind of problem called? Or what keywords should I search for?

Thank you in advance!

submitted by /u/Doo0oog

[D] To use triplet loss or not when class labels are given. Question about theoretical/experimental expectations.

Hi, I am looking for theoretical (or experimental) evidence for the superiority (or not) of the triplet loss over the cross-entropy loss. Do you know of any research papers that benchmark the following setup?

  • Let’s say we have a fixed dataset which contains M images with annotated labels, e.g. the MNIST dataset.
  • Then we train two models (with the same architecture), one with regular categorical cross-entropy and the second one using a triplet-loss (or contrastive) approach.
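
For reference, the two objectives in this setup can be written down for a single sample as in the sketch below. The squared Euclidean distance and the margin of 0.2 are assumptions chosen for illustration; other distances and margins are common.

```python
import math

def cross_entropy(logits, label):
    """Categorical cross-entropy of one sample, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)

# Cross-entropy scores class logits against a label.
print(cross_entropy([2.0, 0.5, -1.0], label=0))

# Triplet loss only compares embeddings: same-class pair vs. different-class pair.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 0.0]))  # negative already far away
print(triplet_loss([0.0, 0.0], [1.0, 0.0], [0.0, 1.0]))  # negative as close as the positive
```

The cross-entropy model needs an output layer with one score per known class, while the triplet model only produces embeddings; labels enter the triplet loss solely through the choice of positive and negative samples.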

Since the dataset and model architectures are fixed (I assume all other hyperparameters are also fixed, maybe except the learning rate and number of epochs), we will have two models trained to minimize different objectives. I wonder if there is some common knowledge that answers the following questions:

  • Can we expect one of the approaches to have better test accuracy?
  • Can we expect one of the approaches to generalize better to new classes (e.g. classes not present in the training dataset)? I mean, triplet loss was first used for face recognition, so one would expect that embeddings generated by a model trained with triplet loss should be more useful for finding new classes.
  • Are there other expected differences? (I’m aware that a model trained with triplet loss requires a different methodology for measuring performance.)

TLDR: different objective functions should result in different models. Can we anticipate the performance differences without training any model?

submitted by /u/kmkolasinski

[Discussion] Next AI winter? Do you think that deep learning solutions will be used in production?

ML/AI is gaining traction, but not many solutions are actually being deployed in production, especially in corporations. In startups, sure, because very often the main product is AI-based, but I have serious doubts whether regular corporations will continue investing in machine learning engineers. There is still enthusiasm, but not many success stories, in either CV or NLP (https://medium.com/tooploox-ai/how-to-optimize-your-operations-with-ai-nlp-d3180f2670e3).

submitted by /u/jasabdom

[P] Trained a StyleGAN on all emojis from emojipedia. Samples posted hourly. Please help describe the samples!

I set up a twitter bot posting a new sample every hour here: https://twitter.com/EmojiPainter

I am trying to collect a dataset of images and associated descriptions. I am hoping to collect human-generated descriptions to augment the existing emojis (which are too few to train an image-language model on).

The images can be unintentionally hilarious but also sometimes disturbing! Enjoy!

submitted by /u/nehalemlabs

[P] DialoGPT: State of the Art Conversational Model with Fine-Tuned GPT-2 (Microsoft Research)

https://github.com/microsoft/dialogpt

According to Microsoft:

The human evaluation results indicate that the response generated from DialoGPT is comparable to human response quality under a single-turn conversation Turing test.

Example conversation:

User: Does money buy happiness?

Bot: Depends how much money you spend on it.

User: What is the best way to buy happiness?

Bot: You just have to be a millionaire by your early 20s, then you can be happy.

User: This is so difficult!

Bot: You have no idea how hard it is to be a millionaire and happy . There is a reason the rich have a lot of money

submitted by /u/anustretch