Category: Reddit MachineLearning

[D] BatchNorm alternatives 2019

The main reason people still use BatchNorm, despite it being compute-heavy (~25% of total model compute), is the fast official cuDNN implementation. It is the same reason RNNs other than LSTM and GRU never became popular.

Also, BatchNorm requires computing a square root and a division, both of which need full precision to work properly. Going half-precision or applying quantization is not easy.
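
To make the square root and division concrete, here is a minimal NumPy sketch of the training-mode BatchNorm forward pass. It is an illustration only, not the cuDNN kernel; the function name `batch_norm_forward` and the default `eps` are assumptions chosen for this example.

```python
import numpy as np


def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization over the batch axis (axis 0).

    Hypothetical helper for illustration; not a library API.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)  # biased (population) variance per feature
    # The square root and the division mentioned in the post happen here.
    inv_std = 1.0 / np.sqrt(var + eps)
    return gamma * (x - mean) * inv_std + beta


rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(64, 4)).astype(np.float32)
y = batch_norm_forward(
    x,
    gamma=np.ones(4, dtype=np.float32),
    beta=np.zeros(4, dtype=np.float32),
)
```

With `gamma = 1` and `beta = 0`, each output feature should have roughly zero mean and unit variance across the batch. The `1 / sqrt(var + eps)` term is the part that becomes delicate when the arithmetic is moved to lower precision.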

Anyway, are there any new methods that can dethrone BatchNorm entirely? Some papers:

Equinormalization https://openreview.net/forum?id=r1gEqiC9FX

Generalized Hamming Network https://arxiv.org/abs/1710.10328

submitted by /u/tsauri

[D] Are conferences interested in papers that introduce new datasets?

I’ve been working with a new dataset and running standard models on it. Would any conferences be interested in a paper that introduces this NLP dataset, details what’s special about it, and reports results of current SOTA methods?

A bit about the data: each example has text, code, and a position in a graph, so it poses a task where methods exist for each of these data types but few handle all of them combined. It could be interesting for someone working with NLP or GNNs. Essentially, there are a lot of complex relationships within this data that I haven’t seen other datasets match.

submitted by /u/searchingundergrad

[D] Specific tips on Machine Learning research in a PhD

I am a new Machine Learning PhD student and my topic is roughly vision, i.e. semantic/instance segmentation, and to be honest I am a little lost.

How, specifically, do you conduct research in this field? What does the day-to-day work look like?

  • Do you think of new NN architectures and test them experimentally?
  • Do you download other people’s models and just try them out on your own datasets?
  • How do you keep track of different architectures, papers, etc.? Maybe with an Excel document listing all the papers you’ve read, each with a short summary?

I would be really interested in what the day-to-day work of other researchers in the field looks like and what specific tips you might have.

submitted by /u/schrowawey

[D] Inter-annotator agreement: how does it work for computer vision?

We have a dataset which we need to annotate: the task is object detection, thus we need to create bounding boxes. We’re going to use

https://github.com/wkentaro/labelme

But I’m open to alternative suggestions if you think there are better tools. Since the dataset is very large and very confidential, we’re going to annotate it in-house. I’ve heard of people trying to estimate the error due to subjectivity and mistakes in human annotation, but I don’t quite understand how it works. Let’s suppose, for the sake of example, that I have 900 images and 3 annotators. If I understand correctly, rather than partitioning the dataset into three subsets of size 300 and sending each subset to a different annotator, I divide it into three overlapping subsets of size, say, 330, which means that some images will necessarily be annotated by multiple users.

I don’t understand how to use these multiple annotations in practice, though: when I prepare my dataset, for each image which has been annotated by multiple users I’ll have to choose which annotations to use. It’s not like I can have three different bounding boxes (three different ground truths) for each object in the image. So, how does it work in practice?
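
One common way to make this concrete, sketched here under assumptions that are not from the post, is to score agreement on the multiply-annotated images with intersection over union (IoU): pair up boxes from two annotators, then report how many boxes found a partner and how tightly the partners overlap. The function names `iou`, `match_boxes`, and `agreement`, the greedy matching strategy, and the 0.5 threshold are all illustrative choices.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min, iy_min = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix_max, iy_max = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_boxes(boxes_a, boxes_b, threshold=0.5):
    """Greedily pair boxes from two annotators, best IoU first.

    Returns a list of (index_in_a, index_in_b, iou) for pairs at or above threshold.
    """
    candidates = sorted(
        ((iou(a, b), i, j) for i, a in enumerate(boxes_a) for j, b in enumerate(boxes_b)),
        reverse=True,
    )
    used_a, used_b, matches = set(), set(), []
    for score, i, j in candidates:
        if score < threshold:
            break  # remaining candidates overlap even less
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matches.append((i, j, score))
    return matches


def agreement(boxes_a, boxes_b, threshold=0.5):
    """Return (fraction of all boxes that were matched, mean IoU of matched pairs)."""
    matches = match_boxes(boxes_a, boxes_b, threshold)
    total = len(boxes_a) + len(boxes_b)
    matched_fraction = 2 * len(matches) / total if total else 1.0
    mean_iou = sum(m[2] for m in matches) / len(matches) if matches else 0.0
    return matched_fraction, mean_iou


# Two annotators labelled the same image; they agree on one object only.
annotator_1 = [(10, 10, 50, 50), (100, 100, 150, 160)]
annotator_2 = [(12, 8, 52, 48), (300, 300, 320, 320)]
matched_fraction, mean_iou = agreement(annotator_1, annotator_2)
```

In a setup like this, the overlapping images would serve mainly to estimate annotation noise. For the training set itself, one would still keep a single ground truth per object, for example one annotator's box or a merged box for each matched pair.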

submitted by /u/arkady_red

[N] Deep Graph Library new release (v0.3.1)

Though only a minor release, v0.3.1 includes a number of very useful Graph Neural Network modules and model examples that can be used directly in your project. Here is a list of the new modules:

New NN Modules

New global pooling module

New graph transformation routines

  • dgl.transform.khop_adj
  • dgl.transform.khop_graph
  • dgl.transform.laplacian_lambda_max
  • dgl.transform.knn_graph
  • dgl.transform.segmented_knn_graph
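
As a rough illustration of what a k-nearest-neighbour graph transformation computes, here is a plain NumPy sketch. It is not DGL's implementation and does not call DGL. The function name `knn_edges`, the exclusion of self-loops, and the neighbour-to-point edge direction are conventions assumed for this example.

```python
import numpy as np


def knn_edges(points, k):
    """Build k-NN edges for a set of points (hypothetical helper, not a DGL API).

    Returns (src, dst) index arrays with one edge from each of a point's
    k nearest neighbours (src) to that point (dst).
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = (diff ** 2).sum(axis=-1)        # pairwise squared Euclidean distances
    np.fill_diagonal(dist, np.inf)         # a point is not its own neighbour here
    neighbours = np.argsort(dist, axis=1)[:, :k]
    dst = np.repeat(np.arange(len(points)), k)
    src = neighbours.reshape(-1)
    return src, dst


# Four points on a line forming two tight pairs.
src, dst = knn_edges([[0, 0], [1, 0], [5, 0], [6, 0]], k=1)
```

This brute-force version builds the full pairwise distance matrix, so it only suits small point sets.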

This DGL release also includes a model zoo for chemistry applications, such as using GNNs to predict molecular properties or to generate new molecular structures, which is valuable for drug discovery. Pre-trained models are also available for download in just two lines of code:

```python
from dgl.data import Tox21
from dgl import model_zoo

dataset = Tox21()
model = model_zoo.chem.load_pretrained('GCN_Tox21')  # Pretrained model loaded
model.eval()

smiles, g, label, mask = dataset[0]
feats = g.ndata.pop('h')
label_pred = model(g, feats)
print(smiles)
# CCOc1ccc2nc(S(N)(=O)=O)sc2c1
print(label_pred[:, mask != 0])  # Mask non-existing labels
# tensor([[-0.7956,  0.4054,  0.4288, -0.5565, -0.0911,
#           0.9981, -0.1663,  0.2311, -0.2376,  0.9196]])
```

Check it out if you are using GNNs, working with molecules or just interested in this whole new field.

See the full release notes here: https://www.dgl.ai/release/2019/08/28/release.html.

submitted by /u/jermainewang

[D] Is “Wasserstein metric” the right name to use?

According to Wikipedia: “The name ‘Wasserstein distance’ was coined by R. L. Dobrushin in 1970, after the Russian mathematician Leonid Vaseršteĭn who introduced the concept in 1969.” And indeed, I found the paper by Dobrushin, written in Russian, which lists in its references: “Л. Н. Васерштейн, Марковские процессы на счетном произведении пространств, описывающие большие системы автоматов. Пробл. передачи информ. 5, 3 (1969), 64-73.” [L. N. Vasershtein, “Markov processes over denumerable products of spaces, describing large systems of automata,” Problems of Information Transmission 5, 3 (1969), 64-73.] Leonid Vaseršteĭn is just the romanization of Леонид Васерштейн.

Although I cannot read Russian and could not find the text of the original paper by Leonid Vaseršteĭn, the Wikipedia account still seems convincing.

However, it seems the Fréchet distance is identical to the 2-Wasserstein distance, and the Fréchet distance was introduced in 1957, according to the original French paper “Sur la distance de deux lois de probabilité”.

Does this mean Fréchet discovered it first and Wikipedia is wrong about the origin? What’s more, should we call it the Fréchet distance instead of the Wasserstein distance?
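
For context on what is being compared, here is a small sketch of the 2-Wasserstein distance in one dimension. For two 1-D Gaussians it has the closed form sqrt((m1 - m2)^2 + (s1 - s2)^2), and for two equal-size samples on the real line it can be computed by pairing sorted values. The function names `w2_gaussian_1d` and `w2_empirical_1d` are hypothetical and chosen only for this illustration.

```python
import math

import numpy as np


def w2_gaussian_1d(mean_a, std_a, mean_b, std_b):
    """Closed-form 2-Wasserstein distance between N(mean_a, std_a^2) and N(mean_b, std_b^2)."""
    return math.sqrt((mean_a - mean_b) ** 2 + (std_a - std_b) ** 2)


def w2_empirical_1d(samples_a, samples_b):
    """2-Wasserstein distance between two equal-size 1-D samples.

    On the real line the optimal coupling pairs the i-th smallest value of one
    sample with the i-th smallest value of the other.
    """
    a = np.sort(np.asarray(samples_a, dtype=float))
    b = np.sort(np.asarray(samples_b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes samples of equal size")
    return float(np.sqrt(np.mean((a - b) ** 2)))


closed_form = w2_gaussian_1d(0.0, 1.0, 3.0, 2.0)

rng = np.random.default_rng(0)
estimate = w2_empirical_1d(
    rng.normal(0.0, 1.0, size=100_000),
    rng.normal(3.0, 2.0, size=100_000),
)
```

The sample-based estimate should land close to the closed-form value, which shows that the two are the same quantity computed in two ways.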

P.S.

If you search for “Fréchet distance” on Google, what comes up is not a distance between distributions but a distance between paths. I am confused about the relationship between the “Fréchet distance between paths” and the “Fréchet distance between distributions”.

submitted by /u/746645147