
Category: Reddit MachineLearning

[N] Open world RPG with ‘dungeon master ai’ and ‘story engine’ in the works using neural network and machine learning

https://www.youtube.com/watch?v=tw6CUVk4mn0

https://i.imgur.com/EVZjPQs.jpg

The Story Engine the studio is teasing is meant to act like a human DM: it would track the repercussions of your actions across the open-world game, and NPCs would react and respond to your character in an evolving way.

Here is what they said in the video about it:

“Julian and I have worked in multiple situations where we had to design storytelling apps that would build a story around your actions. Nobody in the RPG industry, as far as I know, has spent the time that we have spent inside the educational industry trying to do this, and we have spent a lot of time modifying, prototyping, and building a story engine that we think will change how an RPG is done. We want to do what it felt like when a DM would sit across from you and react to what you did.

It started in infancy at Arena and was very primitive, Daggerfall was more involved, and then every once in a while, years after, it was obvious that a story engine would be needed, and it would get more advanced every time I was on educational products. At this point I’ve been tinkering with it on and off for a few decades. It’s gotten pretty good at this point. We hope it will be able to drive stories that would simulate a real person, a Dungeon Master specifically, and continually supply the player with some decent stories, but also make sure that there is continuation to these stories that would affect the world in a meaningful and persistent way, so there is meaning in the short term to the quests and meaning after the quest is completed. That’s the idea, and it looks like after all that time, the technology will be able to do something like that.”

submitted by /u/bugsixx

[R] Unsupervised Universal Self-Attention Network for Graph Classification

Paper: https://arxiv.org/pdf/1909.11855.pdf

Abstract:

Existing graph neural network-based models often have weaknesses in exploiting potential dependencies among nodes and graph structure properties. To this end, we present U2GNN, a novel embedding model leveraging the strength of the recently introduced universal self-attention network (Dehghani et al., 2019) to learn low-dimensional embeddings of graphs which can be used for graph classification. In particular, given an input graph, U2GNN first applies a self-attention computation, which is then followed by a recurrent transition to iteratively memorize its attention on vector representations of each node and its neighbors across each iteration. Thus, U2GNN can address the weaknesses of the existing models and produce plausible node embeddings whose sum is the final embedding of the whole graph. Experimental results in both supervised and unsupervised training settings show that our U2GNN produces new state-of-the-art performance on a range of well-known benchmark datasets for the graph classification task. To the best of our knowledge, this is the first work showing that an unsupervised model can outperform supervised models by a large margin.

Contributions:

  • In this paper we consider a novel strategy of using the unsupervised training setting to train a GNN-based model for the graph classification task, where node features and global information are incorporated.
  • U2GNN can be seen as a general framework, and we demonstrate the power of our model in both the supervised and unsupervised training settings. The experimental results on 9 benchmark datasets show that both our supervised and unsupervised U2GNN models produce new state-of-the-art (SOTA) accuracies in most benchmark cases.
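The attention-then-transition loop described in the abstract can be sketched in a few lines. This is a plain-NumPy illustration of the general idea only — the function names, the `tanh` stand-in for the recurrent transition, and the tiny toy graph are my own assumptions, not the authors' code:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def u2gnn_step(h, neighbors, W_q, W_k, W_v):
    """One simplified self-attention step over each node's neighborhood."""
    out = np.empty_like(h)
    for i, nbrs in enumerate(neighbors):
        ctx = h[[i] + nbrs]                       # the node plus its neighbors
        q = h[i] @ W_q                            # query from the center node
        k, v = ctx @ W_k, ctx @ W_v               # keys/values from the context
        attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
        out[i] = np.tanh(attn @ v)                # stand-in for the recurrent transition
    return out

rng = np.random.default_rng(0)
d = 8
h = rng.normal(size=(4, d))                       # 4 nodes with d-dim features
neighbors = [[1, 2], [0], [0, 3], [2]]            # toy adjacency lists
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
for _ in range(3):                                # a few attention iterations
    h = u2gnn_step(h, neighbors, W_q, W_k, W_v)
g = h.sum(axis=0)                                 # graph embedding = sum of node embeddings
```

Summing (rather than averaging) the final node vectors matches the abstract's description of the whole-graph embedding.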

submitted by /u/daiquocnguyen

[R] Soft-Label Dataset Distillation and Text Dataset Distillation

Paper: https://arxiv.org/abs/1910.02551

Code: https://github.com/ilia10000/dataset-distillation

Dataset distillation is a method for reducing dataset sizes by learning a small number of synthetic samples containing all the information of a large dataset. This has several benefits, such as speeding up model training, reducing energy consumption, and reducing required storage space. Currently, each synthetic sample is assigned a single `hard’ label, and dataset distillation can only be applied to image data. We propose to simultaneously distill both images and their labels, thus assigning each synthetic sample a `soft’ label (a distribution over labels). Our algorithm increases accuracy by 2-4% over the original algorithm on several image classification tasks. Using `soft’ labels also enables distilled datasets to consist of fewer samples than there are classes, as each sample can encode information for multiple classes. For example, training a LeNet model with 10 distilled images (one per class) results in over 96% accuracy on MNIST, and almost 92% accuracy when trained on just 5 distilled images. We also extend the dataset distillation algorithm to distill sequential datasets, including text. We demonstrate that text distillation outperforms other methods across multiple datasets. For example, models attain almost their original accuracy on the IMDB sentiment analysis task using just 20 distilled sentences.
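The key idea of a `soft’ label — a full distribution over classes rather than a one-hot target — can be seen in a toy setting. This sketch (plain NumPy; the sample, dimensions, and step size are illustrative assumptions, not the paper's code) fits a linear classifier to a single synthetic sample whose soft label spreads mass over three classes:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d = 4
x_syn = rng.normal(size=d)
x_syn /= np.linalg.norm(x_syn)          # unit norm keeps the step size simple
y_soft = np.array([0.7, 0.2, 0.1])      # a distribution over 3 classes, not one-hot

W = np.zeros((d, 3))
for _ in range(500):
    p = softmax(x_syn @ W)
    W -= 0.5 * np.outer(x_syn, p - y_soft)   # cross-entropy gradient step
p = softmax(x_syn @ W)                       # p converges toward y_soft
```

Because the classifier's predicted distribution is pulled toward the soft target, one synthetic sample can carry information about several classes at once — which is what lets a distilled set contain fewer samples than there are classes.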

This is my first full-length paper and code release so I’d love to get your feedback on the text and the code, especially since this will likely be part of my thesis!

submitted by /u/ilia10000

[D][P] Would anyone like to join our NLP project for the Google/TensorFlow online hackathon (deadline Dec 30th)? It’s a natural language recommendation engine, powered by BERT.

We’re a team of 4 people working on a natural language recommendation engine, powered by BERT. The hackathon team limit is 6, so we were wondering if there is anyone who might be a great fit for the team. Here are a few areas of the project (prior experience in these is definitely not required):

-Network/graph data processing/cleaning/preparation.

-Negative sampling data pipelines with tf.data.

-Incorporating negative/candidate sampling into Keras.

-FAISS for embedding similarity lookup over millions of embeddings.

-HuggingFace’s TensorFlow 2.0/Keras models.

-TPU training with Keras.
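For a sense of what the FAISS piece does, here is a brute-force NumPy stand-in for inner-product similarity search (what FAISS's flat indexes compute, but at a scale where brute force stops being practical). The function names and toy data are mine, not the project's:

```python
import numpy as np

def build_index(embeddings):
    # normalize rows so inner product equals cosine similarity
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / norms

def search(index, query, k=3):
    """Return the ids and scores of the k most similar embeddings."""
    q = query / np.linalg.norm(query)
    scores = index @ q                  # cosine similarity to every row
    top = np.argsort(-scores)[:k]
    return top, scores[top]

rng = np.random.default_rng(1)
emb = rng.normal(size=(1000, 32))       # pretend BERT-derived item embeddings
index = build_index(emb)
ids, scores = search(index, emb[42], k=3)
# querying with item 42's own embedding should rank item 42 first, score ~1.0
```

FAISS replaces the `index @ q` scan with approximate structures (IVF, HNSW, product quantization) so the same lookup stays fast over millions of vectors.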

Here’s a link to the hackathon

https://tfworld.devpost.com/

If you’re interested, pm me so that we can share backgrounds.

submitted by /u/AdditionalWay

[R] How degenerate is the parametrization of neural networks with the ReLU activation function?

Paper (PDF on arXiv)

NeurIPS 2019 Poster

Abstract:

Neural network training is usually accomplished by solving a non-convex optimization problem using stochastic gradient descent. Although one optimizes over the network’s parameters, the main loss function generally only depends on the realization of the neural network, i.e. the function it computes. Studying the optimization problem over the space of realizations opens up new ways to understand neural network training. In particular, usual loss functions like mean squared error and categorical cross entropy are convex on spaces of neural network realizations, which themselves are non-convex. Approximation capabilities of neural networks can be used to deal with the latter non-convexity, which allows us to establish that for sufficiently large networks, local minima of a regularized optimization problem on the realization space are almost optimal.

Note, however, that each realization has many different, possibly degenerate, parametrizations. In particular, a local minimum in the parametrization space need not correspond to a local minimum in the realization space. To establish such a connection, inverse stability of the realization map is required, meaning that proximity of realizations must imply proximity of corresponding parametrizations. We present pathologies which prevent inverse stability in general and, for shallow networks, proceed to establish a restricted space of parametrizations on which we have inverse stability w.r.t. a Sobolev norm. Furthermore, we show that by optimizing over such restricted sets, it is still possible to learn any function which can be learned by optimization over unrestricted sets.
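A standard concrete instance of this degeneracy (my example, not taken from the paper) is the positive-rescaling invariance of ReLU: for any λ > 0, scaling a hidden neuron's incoming weights by λ and its outgoing weight by 1/λ leaves the realized function unchanged, so wildly different parametrizations compute the same thing:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def shallow_net(x, W, b, v):
    # one hidden ReLU layer: x -> ReLU(W x + b) -> v^T (.)
    return v @ relu(W @ x + b)

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3))
b = rng.normal(size=5)
v = rng.normal(size=5)

lam = 7.3                               # any positive rescaling factor
W2, b2, v2 = lam * W, lam * b, v / lam  # a very different parametrization

x = rng.normal(size=3)
same = np.allclose(shallow_net(x, W, b, v), shallow_net(x, W2, b2, v2))
```

The identity holds because ReLU is positively homogeneous: relu(λz) = λ·relu(z) for λ > 0, and the 1/λ on the output weights cancels the λ exactly. This is one reason proximity of realizations cannot, in general, imply proximity of parametrizations.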

submitted by /u/julbern

[D] Struggled with reading deep learning papers

Actually, as a senior graduate student, I have been doing research in deep learning/NLP for several years. But there is a problem that has troubled me a lot during these years. Specifically, a lot of deep learning papers (especially those introducing a new model for some very specific task, e.g., reading comprehension, text-to-SQL, etc.) give me the feeling that some of the model's design is highly engineered and not that intuitive; in other words, there could be many alternative designs for some module, but few papers really justify in depth why they adopt their specific design. For instance, in a seq2seq setting, some may directly use BERT as the encoder, while others may use BERT to generate embeddings for the input sequence first and then feed those embeddings to an LSTM encoder. In fact, this example doesn't reveal the problem completely, since there are other scenarios with tons of different possible designs that might work, and different papers always adopt their very own design without much justification.

This really makes me feel extremely bad! First, as someone who is always eager to know WHY, those papers can't answer my questions — or maybe it's just not smart to ask why-questions in the context of deep learning model design. It makes research in this field look more like engineering, or even art, rather than science. Secondly, those varied designs make it really difficult to compare different models. It's really hard to control the variables! If one model achieves better performance than another, it's hard to tell whether that is truly due to what the paper claims or to some other subtle, tricky design choice.

I don’t know if there are other people who feel the same way as I do. How should I adjust my mindset for doing research in this field?

submitted by /u/entslscheia