Author: torontoai

[N] Deep Graph Library new release (v0.3.1)

Though only a minor release, v0.3.1 includes a number of useful Graph Neural Network modules and model examples that can be used directly in your project. Here is a list of what is new:

New NN Modules

New global pooling module

New graph transformation routines

  • dgl.transform.khop_adj
  • dgl.transform.khop_graph
  • dgl.transform.laplacian_lambda_max
  • dgl.transform.knn_graph
  • dgl.transform.segmented_knn_graph
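To make the kNN routines above a little more concrete, here is a minimal NumPy sketch of what a k-nearest-neighbor graph is. This is not DGL's implementation: `knn_edges` is a hypothetical helper, and the choices to exclude self-loops and to point edges from each point to its neighbors are assumptions of this sketch only.

```python
import numpy as np

def knn_edges(points, k):
    """Return (src, dst) arrays linking each point to its k nearest neighbors."""
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    # Assumption of this sketch: a point is not its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Indices of the k closest points for every row.
    nbrs = np.argsort(dist, axis=1)[:, :k]
    src = np.repeat(np.arange(len(points)), k)
    dst = nbrs.reshape(-1)
    return src, dst

# Two tight pairs of points, far apart from each other.
points = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
src, dst = knn_edges(points, k=1)
edges = list(zip(src.tolist(), dst.tolist()))
print(edges)  # [(0, 1), (1, 0), (2, 3), (3, 2)]
```

Each point connects to the other member of its pair, which is the structure a kNN graph over point-cloud features is meant to capture.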

This DGL release also includes a model zoo for chemistry applications, such as using GNNs to predict molecular properties or to generate new molecular structures, which is valuable for drug discovery. Pre-trained models are also available for download in just two lines of code:

```python
from dgl.data import Tox21
from dgl import model_zoo

dataset = Tox21()
model = model_zoo.chem.load_pretrained('GCN_Tox21')  # Pretrained model loaded
model.eval()

smiles, g, label, mask = dataset[0]
feats = g.ndata.pop('h')
label_pred = model(g, feats)
print(smiles)                    # CCOc1ccc2nc(S(N)(=O)=O)sc2c1
print(label_pred[:, mask != 0])  # Mask non-existing labels
# tensor([[-0.7956, 0.4054, 0.4288, -0.5565, -0.0911,
#          0.9981, -0.1663, 0.2311, -0.2376, 0.9196]])
```

Check it out if you are using GNNs, working with molecules or just interested in this whole new field.

See the full release notes here: https://www.dgl.ai/release/2019/08/28/release.html.

submitted by /u/jermainewang

[D] Is “Wasserstein metric” the right name to use?

According to Wikipedia: “The name ‘Wasserstein distance’ was coined by R. L. Dobrushin in 1970, after the Russian mathematician Leonid Vaseršteĭn who introduced the concept in 1969.” And indeed, I found the paper written in Russian by Dobrushin, which lists in its references: “Л. Н. Васерштейн, Марковские процессы на счетном произведении пространств, описывающие большие системы автоматов. Пробл. передачи информ. 5, 3 (1969), 64-73.” Leonid Vaseršteĭn is just the romanized form of Леонид Васерштейн.

Although I cannot read Russian and could not find the original paper by Leonid Vaseršteĭn, the Wikipedia account still seems convincing.

However, it seems the Fréchet distance is identical to the 2-Wasserstein distance, and the Fréchet distance was introduced in 1957, according to the original French paper “Sur la distance de deux lois de probabilité”.

Does this mean Fréchet discovered it first and Wikipedia is wrong about the origin? What’s more, should we call it the Fréchet distance instead of the Wasserstein distance?
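For what it is worth, the claimed equivalence is easy to check numerically in one dimension. The following is a minimal sketch, assuming 1-D Gaussians; the function names are made up for illustration. The closed form sqrt((mu1 - mu2)^2 + (sigma1 - sigma2)^2), which is what the Fréchet distance between two Gaussians reduces to in 1-D, should agree with a 2-Wasserstein distance estimated by matching sorted samples.

```python
import numpy as np

def w2_gaussian_1d(mu1, sigma1, mu2, sigma2):
    """Closed-form distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)."""
    return float(np.sqrt((mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2))

def w2_empirical_1d(x, y):
    """2-Wasserstein distance between two equal-size 1-D samples.

    In 1-D the optimal coupling pairs the i-th smallest value of x
    with the i-th smallest value of y.
    """
    x, y = np.sort(x), np.sort(y)
    return float(np.sqrt(np.mean((x - y) ** 2)))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=200_000)
y = rng.normal(3.0, 2.0, size=200_000)

closed_form = w2_gaussian_1d(0.0, 1.0, 3.0, 2.0)  # sqrt(9 + 1) = sqrt(10)
estimate = w2_empirical_1d(x, y)
print(closed_form, estimate)
```

The two numbers should agree up to sampling noise. This only illustrates the Gaussian case and says nothing about who should get credit for the name.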

P.S.

If you search for “Fréchet distance” on Google, what comes up is not a distance between distributions but a distance between paths. I am confused about the relationship between the “Fréchet distance between paths” and the “Fréchet distance between distributions”.

submitted by /u/746645147

[R] Evolving Space-Time Neural Architectures for Videos (Google Brain) ICCV

Paper: https://arxiv.org/abs/1811.10636

Code: https://github.com/piergiaj/evanet-iccv19

Abstract:

We present a new method for finding video CNN architectures that capture rich spatio-temporal information in videos. Previous work, taking advantage of 3D convolutions, obtained promising results by manually designing video CNN architectures. We here develop a novel evolutionary search algorithm that automatically explores models with different types and combinations of layers to jointly learn interactions between spatial and temporal aspects of video representations. We demonstrate the generality of this algorithm by applying it to two meta-architectures, obtaining new architectures superior to manually designed architectures. Further, we propose a new component, the iTGM layer, which more efficiently utilizes its parameters to allow learning of space-time interactions over longer time horizons. The iTGM layer is often preferred by the evolutionary algorithm and allows building cost-efficient networks. The proposed approach discovers new and diverse video architectures that were previously unknown. More importantly they are both more accurate and faster than prior models, and outperform the state-of-the-art results on multiple datasets we test, including HMDB, Kinetics, and Moments in Time. We will open source the code and models, to encourage future model development.

submitted by /u/Himalun

[D] Eric Drexler’s “Reframing Superintelligence”

Following the Slate Star Codex review of “Reframing Superintelligence”, I (as an AI researcher) have become pretty excited to see that such a comprehensive reply exists to Bostrom-type “paperclip maximizer” fears of AGI. A good summary is “Less Like Us: An Alternate Theory of Artificial General Intelligence”. The basic idea is that, realistically, AI is not developed with the ability to self-improve and do whatever it wants, so we should not fear AGIs that get out of control in this way.

What do you think of this reply to AGI concerns? Certainly, given present-day AI and how it is developing, “service AI” seems like a cogent prediction of what is actually likely to come about and of what we need to be wary of getting wrong.

submitted by /u/regalalgorithm

[R] Google AI Blog: Exploring Weight Agnostic Neural Networks

In “Weight Agnostic Neural Networks” (WANN), we present a first step toward searching specifically for networks with these biases: neural net architectures that can already perform various tasks, even when they use a random shared weight. Our motivation in this work is to question to what extent neural network architectures alone, without learning any weight parameters, can encode solutions for a given task. By exploring such neural network architectures, we present agents that can already perform well in their environment without the need to learn weight parameters. Furthermore, in order to spur progress in this field, we have also open-sourced the code to reproduce our WANN experiments for the broader research community.

We start with a population of minimal neural network architecture candidates, each with very few connections only, and use a well-established topology search algorithm (NEAT), to evolve the architectures by adding single connections and single nodes one by one.
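The "random shared weight" idea quoted above can be pictured with a small sketch. This is not the authors' code: the topology encoding as 0/1 masks, the tanh activation, the scoring by negative mean squared error, and the particular set of weight values are all assumptions made for illustration. The point is only that a topology is scored by averaging its performance over several values of a single weight shared by every connection.

```python
import numpy as np

# Shared weight values to average over; this particular set is an assumption.
SHARED_WEIGHTS = (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)

def forward(masks, x, shared_w):
    """Run a feed-forward net whose active connections all use one weight."""
    h = x
    for mask in masks:
        # mask is a 0/1 connectivity matrix; every active edge gets shared_w.
        h = np.tanh(h @ (mask * shared_w))
    return h

def mean_score(masks, x, y, weights=SHARED_WEIGHTS):
    """Average negative MSE of a topology over several shared weight values."""
    scores = [-np.mean((forward(masks, x, w) - y) ** 2) for w in weights]
    return float(np.mean(scores))

# Two hypothetical topologies for a 2-input, 1-output toy task.
only_first_input = [np.array([[1.0], [0.0]])]
both_inputs = [np.array([[1.0], [1.0]])]

x = np.array([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]])
y = np.tanh(x[:, :1])  # toy target that depends on the first input only

print(mean_score(only_first_input, x, y), mean_score(both_inputs, x, y))
```

A topology search such as the one described would then add nodes and connections and keep the candidates whose averaged score is best, without ever training individual weights.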

https://weightagnostic.github.io/

Very interesting results from Google, using an evolution-like approach to create network topologies. Thoughts?

submitted by /u/Marha01

[D] Do VAEs have a manifold?

I am kind of confused as to how VAEs do manifold learning.

While I can grasp that regular AEs perform a deterministic transformation from the input vector space to the latent space with the encoder, it is very hard for me to understand how that would work in a VAE. Is the manifold defined over the parameters of the distribution, MU and SIGMA?

Can anyone clarify that for me, maybe point to a paper? Thanks
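One way to picture the distinction raised in the question is the sketch below. It is a toy under stated assumptions, not a trained VAE: the linear "encoder" and its weights `W_mu` and `W_logvar` are hypothetical. It shows that the encoder itself is still a deterministic map from an input to the parameters (mean and log-variance) of a distribution, and that the randomness only enters when a latent code is sampled around that mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear "encoder" weights: 4-D input -> 2-D latent.
W_mu = rng.normal(size=(4, 2))
W_logvar = rng.normal(size=(4, 2)) * 0.1

def encode(x):
    """Deterministic part: map an input to the mean and log-variance of q(z|x)."""
    return x @ W_mu, x @ W_logvar

def sample_latent(mu, logvar, rng):
    """Stochastic part: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

x = np.array([[1.0, 0.0, -1.0, 0.5]])
mu, logvar = encode(x)
samples = np.stack([sample_latent(mu, logvar, rng) for _ in range(20_000)])

print(mu)                    # identical on every call for the same x
print(samples.mean(axis=0))  # sampled codes scatter around mu
```

Under this picture, the mean output plays the role that the code plays in a regular autoencoder, while sigma controls how far samples spread around it.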

submitted by /u/eigenlaplace

[R] DistilBERT: A smaller, faster, cheaper, lighter BERT trained with distillation!

HuggingFace released their first NLP transformer model, “DistilBERT”, which is similar to the BERT architecture but has only 66 million parameters (instead of 110 million) while keeping 95% of BERT’s performance on GLUE.

They released a blog post detailing the procedure, with a hands-on example.

It is also available on their repository pytorch-transformers alongside 7 other transformer models.
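For readers new to the "distillation" in the title, here is a minimal NumPy sketch of generic soft-target knowledge distillation. It is not DistilBERT's actual training objective (see the blog post for that): the temperature value, the toy logits, and the function names are assumptions for illustration. The student is penalized by the KL divergence between the teacher's temperature-softened output distribution and its own.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis, with temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) between softened output distributions."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.0, 0.1]])
good_student = teacher.copy()
poor_student = np.array([[0.5, 1.0, 4.0], [3.0, 0.2, 0.1]])

print(distillation_loss(good_student, teacher))  # 0.0: identical distributions
print(distillation_loss(poor_student, teacher))  # positive: student disagrees
```

A higher temperature flattens the teacher's distribution, so the student also learns from the relative probabilities the teacher assigns to the wrong classes.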

submitted by /u/jikkii