Author: torontoai

[D] I need ideas on how to use the Google Trends data to build a ML model

As the title says, I had the idea of using Google Trends data (from the site or through the unofficial API, if it still works) to train a model of some kind for a university project. As often happens, once I started working I found that my ideas were unrealistic or too ambitious.

I’m not an expert, but I know the basics of Keras and TF. So far I have only downloaded some CSVs from the site and used past values to predict the present. This kind of approach works for periodic data, of course (for example, I tried “ground zero”). I used simple networks based on LSTM, CNN, or MLP.

Given that I only have normalized monthly data covering 15 years (about 180 rows), how can I use one or more of these series? I just need an idea or some kind of reference!
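For a series this short, it may help to frame the task as sliding-window forecasting and compare any network against a seasonal-naive baseline first. The sketch below is only an illustration under stated assumptions: the series is synthetic (a 12-month sine wave plus noise standing in for a real Trends export), and `make_windows` is a hypothetical helper, not part of any library.

```python
import numpy as np

def make_windows(series, lookback):
    """Build (X, y): each row of X holds `lookback` past values, y is the next value."""
    X = np.stack([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = series[lookback:]
    return X, y

# Synthetic stand-in for 15 years of monthly, normalized Trends data (assumption).
rng = np.random.default_rng(0)
months = np.arange(180)
series = 50 + 20 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 3, size=180)

lookback = 12
X, y = make_windows(series, lookback)

# Hold out the last 36 months for evaluation; never shuffle a time series.
split = len(y) - 36
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# Seasonal-naive baseline: predict the value observed 12 months earlier,
# which is the first column of each 12-month window.
naive_pred = X_test[:, 0]
naive_mae = np.mean(np.abs(naive_pred - y_test))

# Linear autoregression fitted by least squares (bias column appended).
A_train = np.hstack([X_train, np.ones((len(X_train), 1))])
A_test = np.hstack([X_test, np.ones((len(X_test), 1))])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
linear_mae = np.mean(np.abs(A_test @ coef - y_test))

print(f"seasonal-naive MAE: {naive_mae:.2f}, linear AR MAE: {linear_mae:.2f}")
```

If an LSTM or CNN cannot beat both baselines on the held-out months, 180 rows are probably too few for that architecture, and combining several related search terms as extra input columns may be a more promising direction.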

submitted by /u/m-i-n-a-r

[News] Megatron-LM: NVIDIA trains 8.3B GPT-2 using model and data parallelism on 512 GPUs. SOTA in language modelling and SQuAD. Details awaited.

Code: https://github.com/NVIDIA/Megatron-LM

Unlike OpenAI, they have released the complete code for data processing, training, and evaluation.

Detailed writeup: https://nv-adlr.github.io/MegatronLM

From github:

Megatron is a large, powerful transformer. This repo is for ongoing research on training large, powerful transformer language models at scale. Currently, we support model-parallel, multinode training of GPT2 and BERT in mixed precision. Our codebase is capable of efficiently training a 72-layer, 8.3 Billion Parameter GPT2 Language model with 8-way model and 64-way data parallelism across 512 GPUs. We find that bigger language models are able to surpass current GPT2-1.5B wikitext perplexities in as little as 5 epochs of training. For BERT training our repository trains BERT Large on 64 V100 GPUs in 3 days. We achieved a final language modeling perplexity of 3.15 and SQuAD F1-score of 90.7.
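The numbers in that description can be sanity-checked with a little arithmetic. The sketch below is a rough estimate, not NVIDIA's accounting: the hidden size of 3072 and the GPT-2 vocabulary of 50257 are assumptions not stated in the quoted text, and the 12·L·h² rule of thumb ignores biases, layer norms, and position embeddings.

```python
# Parallelism layout quoted above: 8-way model parallel x 64-way data parallel.
model_parallel = 8
data_parallel = 64
total_gpus = model_parallel * data_parallel

def approx_transformer_params(layers, hidden, vocab):
    """Rule-of-thumb parameter count for a GPT-style transformer.

    Each layer has about 4*h^2 attention weights and 8*h^2 MLP weights
    (assuming a 4x MLP expansion), plus a vocab x hidden embedding matrix.
    """
    return 12 * layers * hidden ** 2 + vocab * hidden

layers = 72        # from the quoted description
hidden = 3072      # assumption: not given in the quoted text
vocab = 50257      # assumption: GPT-2 BPE vocabulary size

params = approx_transformer_params(layers, hidden, vocab)
print(f"GPUs: {total_gpus}, approx parameters: {params / 1e9:.2f}B")
```

Under these assumptions the estimate lands close to the advertised 8.3 billion parameters, and the GPU count matches the product of the two parallelism degrees.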

Their submission is not on the SQuAD leaderboard, but this exceeds the previous best single-model performance (RoBERTa, 89.8).

For language modelling, they report a zero-shot WikiText-103 perplexity of 17.4 (8.3B model), better than Transformer-XL’s 18.3 (257M). However, they claim this as SOTA even though GPT-2 itself reaches 17.48 ppl and another model reaches 16.4 (https://paperswithcode.com/sota/language-modelling-on-wikitext-103).
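For readers comparing these figures: perplexity is the exponential of the average negative log-likelihood per token, so lower is better and small gaps such as 17.4 versus 17.48 correspond to very small differences in loss. A minimal sketch of the definition follows; the `perplexity` helper is hypothetical, and how each paper normalizes token counts is not covered here.

```python
import math

def perplexity(token_log_probs):
    """Exponential of the mean negative log-likelihood (natural log) per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# A model that spreads probability uniformly over 4 choices at every step.
uniform_log_probs = [math.log(1 / 4)] * 10
ppl = perplexity(uniform_log_probs)

# Difference in average per-token loss between the two perplexities quoted above.
loss_gap = math.log(17.48) - math.log(17.4)
print(ppl, loss_gap)
```

A uniform guess over k options has perplexity k, which is why perplexity is often read as an effective branching factor.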

Sadly, they haven’t said anything about releasing the model weights.

submitted by /u/Professor_Entropy

[P] Kannada-MNIST: A new handwritten digits dataset for the Kannada language

Dear ML community members,
I’d like to disseminate a new handwritten digits dataset, termed Kannada-MNIST, for the Kannada script, that can potentially serve as a direct drop-in replacement for the original MNIST dataset.
In addition to this dataset, I disseminate an additional real-world handwritten dataset (with 10k images), which we term the Dig-MNIST dataset, that can serve as an out-of-domain test dataset.

[Figure: class-wise mean images for the Kannada-MNIST dataset]

  1. I also duly open source all the code as well as the raw scanned images along with the scanner settings so that researchers who want to try out different signal processing pipelines can perform end-to-end comparisons.
  2. I provide high-level morphological comparisons with the MNIST dataset and provide baseline accuracies for the dataset disseminated. The initial baselines obtained using an oft-used CNN architecture (96.8% for the main test-set and 76.1% for the Dig-MNIST test-set) indicate that these datasets do provide a sterner challenge with regard to generalizability than the MNIST or the KMNIST datasets.
  3. I also hope this dissemination will spur the creation of similar datasets for all the languages that use different symbols for the numeral digits.
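Class-wise mean images of the kind mentioned in this post are simple to compute for any MNIST-shaped dataset. The sketch below is an illustration only: the random arrays stand in for the real images (the actual file names and loading code are not given here), and `class_mean_images` is a hypothetical helper.

```python
import numpy as np

def class_mean_images(images, labels, num_classes=10):
    """Average all images that share a label; returns (num_classes, H, W)."""
    means = np.zeros((num_classes,) + images.shape[1:], dtype=np.float64)
    for c in range(num_classes):
        means[c] = images[labels == c].mean(axis=0)
    return means

# Synthetic stand-in with MNIST-like shape: 200 grayscale 28x28 images,
# 20 per digit class (assumption; replace with the real arrays).
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 20)
images = rng.integers(0, 256, size=(200, 28, 28)).astype(np.uint8)

means = class_mean_images(images, labels)
print(means.shape)
```

Computing the same means on both the main test set and Dig-MNIST would be one quick way to see how far the out-of-domain digits drift from the training distribution.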

ArXiv link: 👉 https://arxiv.org/abs/1908.01242

GitHub link: 👉 https://github.com/vinayprabhu/Kannada_MNIST

Kaggle link: 👉 https://www.kaggle.com/higgstachyon/kannada-mnist

Blog: 👉 https://bit.ly/2H43Vbk

Citation:👉 Prabhu, Vinay Uday. “Kannada-MNIST: A new handwritten digits dataset for the Kannada language.” arXiv preprint arXiv:1908.01242 (2019).

submitted by /u/VinayUPrabhu

[Research] What is the State of AutoML in 2019?

https://medium.com/ai%C2%B3-theory-practice-business/what-is-the-state-of-automl-in-2019-64167f581dd1

Abstract—Deep learning has penetrated all aspects of our lives and brought us great convenience. However, the process of building a high-quality deep learning system for a specific task is not only time-consuming but also requires lots of resources and relies on human expertise, which hinders the development of deep learning in both industry and academia. To alleviate this problem, a growing number of research projects focus on automated machine learning (AutoML). In this paper, we provide a comprehensive and up-to-date study on the state-of-the-art AutoML. First, we introduce the AutoML techniques in details according to the machine learning pipeline. Then we summarize existing Neural Architecture Search (NAS) research, which is one of the most popular topics in AutoML. We also compare the models generated by NAS algorithms with those human-designed models. Finally, we present several open problems for future research.

submitted by /u/cdossman

[D] For samples of subsets, features are predictive in gradient boost algorithm.

I’m doing a project with my professor. We are analyzing the impact of each feature on model performance, mostly for gradient boosting. The professor told me that for samples of subsets, features are predictive, and that this is a problem we can observe by analyzing the shallow trees created during the process.

What does “samples of subsets features are predictive” mean? I have been searching the internet but couldn’t find anything. Any ideas?
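One possible reading, offered as an assumption rather than as what the professor meant, is that a feature can be much more predictive within a subset of the samples than over the whole dataset, which is exactly what a shallow tree exposes when a split on one feature matters only beneath a split on another. The NumPy sketch below uses synthetic data and a hypothetical `stump_gain` helper that measures the best variance reduction of a single split.

```python
import numpy as np

def stump_gain(x, y):
    """Best variance reduction achievable by one threshold split on feature x."""
    order = np.argsort(x)
    y_sorted = y[order]
    n = len(y_sorted)
    left_sum = np.cumsum(y_sorted)[:-1]   # target sum in the left child per split
    left_n = np.arange(1, n)
    right_n = n - left_n
    total = y_sorted.sum()
    # Between-group sum of squares for every candidate split position.
    between = left_sum ** 2 / left_n + (total - left_sum) ** 2 / right_n - total ** 2 / n
    return between.max() / n

# Synthetic data (assumption): the target follows x0 only when x1 is positive.
rng = np.random.default_rng(0)
x0 = rng.normal(size=2000)
x1 = rng.normal(size=2000)
y = np.where(x1 > 0, x0, 0.0)

subset = x1 > 0
gain_full = stump_gain(x0, y) / y.var()
gain_subset = stump_gain(x0[subset], y[subset]) / y[subset].var()
print(f"relative gain of x0: full data {gain_full:.2f}, subset {gain_subset:.2f}")
```

In this toy setup, a depth-2 tree would split on x1 first and then find x0 highly predictive in only one branch, so a global importance score would understate how much x0 matters for that subset.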

submitted by /u/DoIHAVeaNIdenTItY

[R] Video Analysis: Gauge Equivariant Convolutional Networks and the Icosahedral CNN

Ever wanted to do a convolution on a Klein Bottle? This paper defines CNNs over manifolds such that they are independent of which coordinate frame you choose. Amazingly, this then results in an efficient practical method to achieve state-of-the-art in several tasks!

https://youtu.be/wZWn7Hm8osA

Abstract: The principle of equivariance to symmetry transformations enables a theoretically grounded approach to neural network architecture design. Equivariant networks have shown excellent performance and data efficiency on vision and medical imaging problems that exhibit symmetries. Here we show how this principle can be extended beyond global symmetries to local gauge transformations. This enables the development of a very general class of convolutional neural networks on manifolds that depend only on the intrinsic geometry, and which includes many popular methods from equivariant and geometric deep learning. We implement gauge equivariant CNNs for signals defined on the surface of the icosahedron, which provides a reasonable approximation of the sphere. By choosing to work with this very regular manifold, we are able to implement the gauge equivariant convolution using a single conv2d call, making it a highly scalable and practical alternative to Spherical CNNs. Using this method, we demonstrate substantial improvements over previous methods on the task of segmenting omnidirectional images and global climate patterns.
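The core notion in the abstract, equivariance, means that transforming the input and then applying the layer gives the same result as applying the layer and then transforming the output. The paper's gauge-equivariant construction is far more general; the sketch below checks only the simplest familiar case, translation equivariance of a circular 1-D convolution, and `circ_conv` is a hypothetical helper written for this illustration.

```python
import numpy as np

def circ_conv(x, kernel):
    """Circular 1-D convolution: out[i] = sum_j x[(i - j) mod n] * kernel[j]."""
    n = len(x)
    return np.array([
        sum(x[(i - j) % n] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ])

rng = np.random.default_rng(0)
signal = rng.normal(size=16)
kernel = rng.normal(size=3)
shift = 5

# Shift then convolve, versus convolve then shift.
shift_then_conv = circ_conv(np.roll(signal, shift), kernel)
conv_then_shift = np.roll(circ_conv(signal, kernel), shift)

equivariant = np.allclose(shift_then_conv, conv_then_shift)
print(equivariant)
```

The two orders agree for every shift because the kernel weights are shared across positions; the paper extends the same weight-sharing idea to local changes of coordinate frame on a manifold.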

Authors: Taco S. Cohen, Maurice Weiler, Berkay Kicanaoglu, Max Welling

Paper: https://arxiv.org/abs/1902.04615

submitted by /u/ykilcher

[R] Building a Better CartPole – DM’s new RL benchmarking suite

Behaviour Suite for Reinforcement Learning

This paper introduces the Behaviour Suite for Reinforcement Learning, or bsuite for short. bsuite is a collection of carefully-designed experiments that investigate core capabilities of reinforcement learning (RL) agents with two objectives. First, to collect clear, informative and scalable problems that capture key issues in the design of general and efficient learning algorithms. Second, to study agent behaviour through their performance on these shared benchmarks. To complement this effort, we open source this http URL, which automates evaluation and analysis of any agent on bsuite. This library facilitates reproducible and accessible research on the core issues in RL, and ultimately the design of superior learning algorithms. Our code is Python, and easy to use within existing projects. We include examples with OpenAI Baselines, Dopamine as well as new reference implementations. Going forward, we hope to incorporate more excellent experiments from the research community, and commit to a periodic review of bsuite from a committee of prominent researchers.

This is a great paper. While the authors focus on comparing different agents, additional value is going to be in debugging algorithm variants. Many researchers already have their own zoos of duct-taped diagnostic envs to try and localise errors, but the community’s been lacking anything ready-made and well-tested.
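To make the idea of a diagnostic environment concrete, here is a minimal sketch of the kind of hand-rolled probe many researchers write for themselves. It is not bsuite's API: `ConstantRewardEnv` and `td0_value_estimate` are hypothetical names, and the environment has one state, one action, and a fixed reward of +1, so a correct value-learning update must converge to 1.

```python
class ConstantRewardEnv:
    """Hypothetical one-step probe: single observation, single action, reward +1."""

    def reset(self):
        return 0  # the only observation

    def step(self, action):
        # Returns (observation, reward, done); every episode ends after one step.
        return 0, 1.0, True

def td0_value_estimate(env, episodes=200, lr=0.1):
    """Tabular TD(0) estimate of the value of the single state."""
    value = 0.0
    for _ in range(episodes):
        env.reset()
        _, reward, done = env.step(0)
        target = reward  # terminal transition, so there is no bootstrap term
        value += lr * (target - value)
    return value

estimate = td0_value_estimate(ConstantRewardEnv())
print(estimate)
```

If an agent's value head fails even this check, the bug is in the loss or the update rather than in exploration or credit assignment, which is the kind of localisation a ready-made suite can standardise.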

What is a little disappointing is that they don’t carry this paper through to ‘here we evaluated 17 different agents and this is the best one’, though presumably other contributors will fix that in short order.

submitted by /u/andyljones