Author: torontoai

[D] Does The Inability Of NAS Algorithms To Outperform Random Search Indicate That Our Algorithms Suck, Or That Random Search Is Surprisingly Effective In Large Spaces?

One of the most counterintuitive developments in ML research is that, despite huge amounts of resources and brain power being poured into the field, state-of-the-art neural architecture search algorithms still can’t outperform pure random search.

This fact is so jarring that I’m surprised it’s not being talked about more often.

What exactly does this mean? Are we just putting out ineffective AutoML algorithms, or has the power of random search been completely overlooked?
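For readers unfamiliar with the baseline in question, random search over an architecture space can be sketched in a few lines. This is only an illustration: the search space, the `evaluate` function, and its scoring rule are hypothetical stand-ins (a real NAS benchmark would train each sampled network and report validation accuracy).

```python
import random

# Hypothetical search space: each key is a design choice, each value its options.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "width": [64, 128, 256],
    "activation": ["relu", "tanh"],
}

def sample_architecture(rng):
    """Draw one architecture uniformly at random from the search space."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}

def evaluate(arch):
    """Toy stand-in for 'train the network and return validation accuracy'."""
    score = 0.5 + 0.03 * arch["num_layers"] + 0.0005 * arch["width"]
    if arch["activation"] == "relu":
        score += 0.02
    return score

def random_search(num_trials, seed=0):
    """Evaluate num_trials random architectures and keep the best one."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_trials):
        arch = sample_architecture(rng)
        score = evaluate(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score
```

Any NAS algorithm claiming an improvement would need to beat the best score this loop finds under the same evaluation budget.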

submitted by /u/mystikaldanger

[P] AdamWR Full Keras + TF-Keras Implementation Available

A follow-up to the original post (pasted below in shortened form), with major changes; release v1.1:

  • Run-based weight decay normalization scheme, normalizing over arbitrary # of iterations independent of LR scheduler (e.g. over all epochs)
  • Full compatibility with TensorFlow 2.0.0 and Keras 2.3.0 (keras + tensorflow.keras)
  • Full compatibility with TensorFlow 1.14.0 and Keras 2.2.5 (keras + tensorflow.keras)
  • Also compatible w/ TF 1.13.0 & 1.15.0, Keras 2.2.3-2.2.4
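To illustrate what weight decay normalization means in the first bullet, here is a minimal sketch assuming the normalization proposed in the AdamW paper (Loshchilov & Hutter), where the decay is scaled by sqrt(b / (B * T)) for batch size b, B training samples, and T epochs. The function and argument names are hypothetical and are not this package's API.

```python
import math

def normalized_weight_decay(lambda_norm, batch_size, num_samples, num_epochs):
    """Scale a normalized decay value by sqrt(b / (B * T)), as in the AdamW paper."""
    return lambda_norm * math.sqrt(batch_size / (num_samples * num_epochs))

def weight_decay_from_iterations(lambda_norm, total_iterations):
    """Same quantity expressed per run: b / (B * T) equals 1 / total_iterations."""
    return lambda_norm / math.sqrt(total_iterations)
```

Because num_samples / batch_size * num_epochs is the total number of iterations, the two functions agree; the second form shows why normalizing over an arbitrary iteration count needs no reference to the learning rate schedule.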

For a complete list of changes, see release notes. Optimizers here.


The latest Lookahead optimizer paper, co-authored by Geoffrey Hinton, used AdamW as its base optimizer and noted that it outperformed plain Adam.

NadamW and SGDW are included, along with their WR (Warm Restart) counterparts – with a cosine annealing learning rate schedule and per-layer learning rate multipliers (useful for pretraining). All optimizers are well-tested, and for me have yielded 3-4% F1-score improvements in already-tuned models for seizure classification.
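For context, the cosine annealing schedule with warm restarts can be sketched as below. This follows the standard SGDR formula, lr = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t_cur / t_period)), and assumes a fixed restart period for simplicity (SGDR also allows the period to grow after each restart). The function names are hypothetical, not the package's interface.

```python
import math

def cosine_annealing_lr(t_cur, t_period, lr_max, lr_min=0.0):
    """Learning rate t_cur iterations into a cosine cycle of length t_period."""
    cosine = math.cos(math.pi * t_cur / t_period)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + cosine)

def lr_with_warm_restarts(iteration, t_period, lr_max, lr_min=0.0):
    """Restart the cosine cycle every t_period iterations (fixed period assumed)."""
    t_cur = iteration % t_period
    return cosine_annealing_lr(t_cur, t_period, lr_max, lr_min)
```

The rate starts each cycle at lr_max, decays toward lr_min, then jumps back to lr_max at the restart.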

submitted by /u/OverLordGoldDragon

[N] Interview with Hamel Husain on semantic code search research at GitHub

“We hope that the community can use this dataset to improve developer tools generally, which may include semantic code search. We hope that the state of the art with regards to representation learning of code is advanced because researchers and practitioners now have a common dataset and a forum in which to discuss results. We also hope that the uniqueness of the dataset will inspire the community to uncover new approaches and techniques for code and natural language understanding.”

That’s a quote from one of the authors of CodeSearchNet – datasets, tools, and benchmarks for representation learning of code. This research on semantic code search has been posted here before as news, but I thought some people here might be interested in the details of what goes into a project like this at a big company. I interviewed Hamel Husain, a machine learning engineer at GitHub, about how the project started and evolved into a wider open source effort to involve the ML research community. Hope there are useful takeaways for people here.

Here’s a link to the interview: https://sourcesort.com/interview/hamel-husain-on-semantic-code-search

And here’s a link to the original paper on arXiv: https://arxiv.org/abs/1909.09436

submitted by /u/Jefro118

[D] What are the potential applications of a hypothetical Object Structure Estimation Model

Hi,

Lately I have been trying to control a robot with a puppet (kinda like a voodoo doll) and want to estimate the keypoints on the puppet to recognize the orientation of the head and the wheels so that I can transfer that as commands to the robot. I couldn’t find anything online about estimating keypoints on custom objects.

That made me wonder: if a model existed that could estimate the structural skeleton of objects, what would its potential applications be?

One application I can think of is recognizing the orientation of objects and not just detecting them in an image with a bounding box. What more could it be used for?
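As a small illustration of the orientation use case, assuming a model has already returned two 2-D keypoints on an object (say, a "tail" and a "head" point, both hypothetical names), the heading can be recovered with `atan2`:

```python
import math

def orientation_degrees(tail, head):
    """Angle of the tail->head vector in degrees, measured from the +x axis
    toward the +y axis. In image coordinates y usually points down, so a
    positive angle appears clockwise on screen."""
    dx = head[0] - tail[0]
    dy = head[1] - tail[1]
    return math.degrees(math.atan2(dy, dx))
```

A bounding box alone cannot provide this angle, which is what distinguishes keypoint estimation from plain detection.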

What are your opinions/thoughts on this?

P.S. I am thinking of this as my thesis for a research paper, but I wanted to make sure it is worth spending time on and to learn about its potential applications.

submitted by /u/theneuralbeing

[D] Can a dense network perform as well as any other architecture?

In a project I am currently working on, the team got into a discussion over whether we should go for a dense MLP or a CNN. That discussion made me wonder: “Can a dense MLP work as well as any other architecture (CNN, GCN) for every task?” A more precise way of putting it: given that we are able to properly train a huge dense network with enough expressive power for the task, and we have enough data for proper training, can a dense network perform as well as other network architectures, in theory? From what I understand, different architectures are just different ways of pooling/sharing information and extracting features. The functions that can be realised by any network should also be realisable by some configuration of a dense network.
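The last claim can be checked on a toy case. The sketch below (pure Python, all names hypothetical) builds the dense weight matrix that reproduces a 1-D convolution layer: each row holds the same kernel weights shifted by one position, with zeros elsewhere.

```python
def conv1d_valid(signal, kernel):
    """1-D 'valid' cross-correlation, the operation a CNN layer computes."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def conv_as_dense_matrix(kernel, input_len):
    """Build the dense weight matrix that reproduces conv1d_valid."""
    k = len(kernel)
    rows = []
    for i in range(input_len - k + 1):
        row = [0.0] * input_len
        row[i:i + k] = kernel  # same weights, shifted one position per row
        rows.append(row)
    return rows

def dense_forward(weights, signal):
    """Plain dense layer without bias: one dot product per output unit."""
    return [sum(w * x for w, x in zip(row, signal)) for row in weights]
```

So a dense layer can represent the convolution exactly, but it has input_len weights per output unit instead of a shared kernel, and it would have to learn the zeros and the weight sharing from data rather than getting them for free.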

submitted by /u/HDidwania

[P] ARIMA vs LSTM – Forecasting Weekly Hotel Cancellations

Over the past while, I’ve been working on a side project to forecast hotel cancellations on a weekly basis (original data and authors available here).

While the original intent of this research was to identify the drivers of such cancellations and predict whether customers would cancel using classification (i.e. cancelling customer = 1, non-cancelling customer = 0), I wanted to investigate whether time series forecasting could be a good addition to this study.

The first step was using pandas for data manipulation, i.e. sorting the cancellations by week and then summing up to get the total number of cancellations every week.
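A minimal sketch of that aggregation step, using a toy DataFrame with one row per booking. The column names (`year`, `week`, `is_canceled`) and the values are assumptions for illustration, not necessarily those of the original dataset.

```python
import pandas as pd

# Hypothetical toy data: one row per booking, 1 = cancelled, 0 = not cancelled.
bookings = pd.DataFrame({
    "year": [2015, 2015, 2015, 2015, 2015],
    "week": [28, 27, 27, 28, 27],
    "is_canceled": [1, 0, 1, 0, 1],
})

# Sort by week, then sum the cancellation flags within each (year, week) group.
weekly_cancellations = (
    bookings
    .sort_values(["year", "week"])
    .groupby(["year", "week"])["is_canceled"]
    .sum()
)
```

The result is a Series indexed by (year, week), which can then be fed to a time series model.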

Following this, I decided to use both ARIMA and LSTM to predict future cancellations across the test set. This was done for two separate hotel datasets (H1 and H2).

Interestingly, I found that LSTM performed better on the more volatile dataset (H2), while ARIMA forecast more accurately on the dataset with a smoother trend (H1).

Ultimately, doing this project reinforced to me that machine learning models like LSTM are just like any other model – they are not necessarily suitable for all situations, and one needs to understand the data they are working with before selecting the model.

If you’re interested in the findings, feel free to take a further look. It is a three-part study; the relevant links are below:

LSTM Forecasts

ARIMA Forecasts (first half of the article covers classification with SVM)

Hope you find this of use, and grateful for any feedback!

submitted by /u/plentyofnodes

[D] FYI Machine Learning Conference (MLconf) in San Francisco 11/8

Just wanted to raise awareness and get a discussion going in case anyone wants to meet up at MLconf (the Friday after next, November 8th). Talks will cover topics such as NLP, Voice Agents, ML & Medical Research, ML & Quantum Computing, ML Models, Data Science for Good, etc.

If you’re not going to be in San Francisco then you can also check out past sessions (going back to 2012) here: https://mlconf.com/sessions/.

If you do want to be there, I’d suggest going to Eventbrite instead of their website since there’s a discount. Below are the speakers and the topics they’ll speak on (where I could find them):

2019 MLconf SF Speakers:

  • Franziska Bell, Senior Data Science Manager on the Platform Team, Uber – Opening Remarks
  • Anitha Kannan, Founding Member, Curai – AI for healthcare: Scaling Access and Quality of Care for Everyone
  • Xavier Amatriain, CTO, Curai – AI for healthcare: Scaling Access and Quality of Care for Everyone
  • Mihajlo Grbovic, Principal Machine Learning Scientist, Airbnb
  • Josh Wills, Software Engineer, Slack – Data Labeling as Religious Experience
  • Ted Willke, Sr. Principal Engineer, Intel
  • Jekaterina Novikova, Director of Machine Learning, Winterlight Labs – Machine Learning Methods in Detecting Alzheimer’s Disease from Speech and Language
  • Bradley Voytek, Associate Professor, UCSD – The Art of Parameterization
  • June Andrews, AI Instruments, Stitch Fix – The Uncanny Valley of ML
  • Sneha Rajana, Software Development Engineer, Amazon – Deep Learning Architectures for Semantic Relation Detection Tasks
  • Noam Finkelstein, PhD Student, Johns Hopkins University – The Importance of Modeling Data Collection
  • Anoop Deoras, Researcher, Netflix – Building an Incrementally Trained, Local Taste Aware, Global Deep Learned Recommender System Model
  • Jamila Smith-Loud, User Researcher, Google
  • Justin Armstrong, Senior Backend Engineer – Applied ML, Compology – Applying Computer Vision to Reduce Contamination in the Recycling Stream
  • Igor Markov, Facebook/ Professor, University of Michigan
  • Vinay Prabhu, Chief Scientist, UnifyID Inc – Project GaitNet: Ushering in the ImageNet moment for human Gait kinematics
  • Meghanna Ravikumar, Machine Learning Engineer, SigOpt – Optimized Image Classification on the Cheap
  • Martin Isaksson, Co-Founder, PerceptiLabs

Sponsors: PerceptiLabs, Oracle, Apple, Proofpoint, HiringSolved, SigOpt, Medium, Walmart Labs, Compology.

Personally, I’m most looking forward to the healthcare applications they’ll go over, but I’m also curious what “Data Labeling as a Religious Experience” means.

submitted by /u/KernalTrick

[N] Newton vs the machine: solving the chaotic three-body problem using deep neural networks

Since its formulation by Sir Isaac Newton, the problem of solving the equations of motion for three bodies under their own gravitational force has remained practically unsolved. Currently, the solution for a given initialization can only be found by performing laborious iterative calculations that have unpredictable and potentially infinite computational cost, due to the system’s chaotic nature. We show that an ensemble of solutions obtained using an arbitrarily precise numerical integrator can be used to train a deep artificial neural network (ANN) that, over a bounded time interval, provides accurate solutions at fixed computational cost and up to 100 million times faster than a state-of-the-art solver. Our results provide evidence that, for computationally challenging regions of phase-space, a trained ANN can replace existing numerical solvers, enabling fast and scalable simulations of many-body systems to shed light on outstanding phenomena such as the formation of black-hole binary systems or the origin of the core collapse in dense star clusters.

Paper: arXiv

Technology Review article: A neural net solves the three-body problem 100 million times faster

submitted by /u/aiismorethanml