
Category: Reddit MachineLearning

[P] MLP for Plate recognition

Hello guys, I’ve recently been working on a project whose objective is to take pictures of cars, extract the license plate, segment each of the characters on the plate, and then feed them into a neural network that outputs what is written on it.

I’m struggling in choosing the dataset.

My questions are:

  1. Should I train it with two different datasets? (i.e. MNIST for digits, save a checkpoint, and then train on a character dataset afterwards)
  2. Or should I use a dataset which contains both numbers and characters? (this would probably be better, but I’m having a hard time finding one)
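On option 2: EMNIST is one dataset that merges digits and letters into a single label set. Whichever dataset is used, the per-character classifier can be a single network over the combined alphabet; here is a minimal numpy sketch of such an MLP forward pass (the shapes, the 36-class digits-plus-uppercase alphabet, and the random weights standing in for trained parameters are all assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# 36 classes: digits 0-9 plus letters A-Z (plates rarely mix case).
N_CLASSES, IN_DIM, HIDDEN = 36, 28 * 28, 128

# Randomly initialised weights stand in for trained parameters.
W1 = rng.normal(0, 0.01, (IN_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0, 0.01, (HIDDEN, N_CLASSES))
b2 = np.zeros(N_CLASSES)

def forward(x):
    """MLP forward pass: one ReLU hidden layer, softmax output."""
    h = np.maximum(x @ W1 + b1, 0.0)
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# A batch of segmented character crops, each flattened to a vector.
batch = rng.random((5, IN_DIM))
probs = forward(batch)
```

With a single 36-way output there is no need for a two-stage MNIST-then-letters curriculum; the network sees both digits and letters from the start.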

If you guys can help me with links or anything else, I would appreciate it.

(I can share the GitHub repo link if you guys wish.)

submitted by /u/iskulll

[D] How do you set up a multi-user data-science workstation?

I’m becoming the de facto sysadmin for a single machine that will be used as a server for a small group of people. I’ve been using Linux for almost a decade now, but I’ve never set up a server for multiple users.

I imagine I’m not the only person who has done this. My plan is roughly as follows:

  • Creating a new user for everyone with useradd.
  • Setting everyone up with SSH and public/private key pairs so they can connect. (Everyone is on macOS and/or some flavor of Linux.)
  • Installing Anaconda to /opt and modifying everyone’s .bashrc (or, alternatively, modifying /etc/profile if that’s better?)
  • Setting up a default environment with numpy, jupyter-lab, and other common data science tools.
  • Showing everyone how to tunnel jupyter-lab over SSH and run it in their browser.
  • Basic security measures (fail2ban, no SSH password auth, etc.) and checking with my organization’s IT to make sure things work properly.

I’m still wondering:

  • Should I lock the base conda environment, so that people have to make child environments and can’t modify the base?
  • Is this the best way to set up Anaconda for multiple users? Is /usr/bin a better place to put it? Etc.
  • Would there be any bottlenecking issues with multiple people using SSH and tunneling on the same port?
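The plan above can be sketched as a provisioning script. Every path, filename, and installer name below is an assumption (a user list in users.txt, per-user public keys in keys/, a pre-downloaded Anaconda installer), so treat this as an outline to adapt, not something to run verbatim:

```shell
#!/usr/bin/env bash
# Provisioning sketch for a shared data-science box (run as root).
set -euo pipefail

# 1. Create accounts and install each user's SSH public key.
while read -r user; do
    useradd -m -s /bin/bash "$user"
    install -d -m 700 -o "$user" -g "$user" "/home/$user/.ssh"
    install -m 600 -o "$user" -g "$user" \
        "keys/$user.pub" "/home/$user/.ssh/authorized_keys"
done < users.txt

# 2. Install Anaconda system-wide, owned by root, so the base
#    environment is read-only for ordinary users.
bash Anaconda3-installer.sh -b -p /opt/anaconda3

# 3. Put conda on everyone's PATH via /etc/profile.d instead of
#    editing each user's .bashrc individually.
ln -s /opt/anaconda3/etc/profile.d/conda.sh /etc/profile.d/conda.sh

# 4. Point per-user environments and package caches at home
#    directories so users can create their own environments.
cat > /opt/anaconda3/.condarc <<'EOF'
envs_dirs:
  - ~/.conda/envs
pkgs_dirs:
  - ~/.conda/pkgs
EOF

# 5. Harden SSH: key auth only, no root login.
sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
systemctl reload sshd
```

Because /opt/anaconda3 is root-owned, a locked base environment falls out for free: users can activate it but not install into it, and `conda create` lands in their home directory. On the port question, everyone tunnels over the single SSH port without contention; what must be distinct is the local port each user binds their jupyter-lab tunnel to.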

submitted by /u/gnulynnux

[D] Why does the BERT paper say that standard conditional language models cannot be bidirectional?

In the original Bert paper, it is stated on page 4 (bottom, first column) that:

Unfortunately, standard conditional language models can only be trained left-to-right or right-to-left, since bidirectional conditioning would allow each word to indirectly “see itself”, and the model could trivially predict the target word in a multi-layered context.

It’s not at all obvious to me why, if you have the sentence “I like funny cats”, predicting the word “funny” while conditioning on the fact that it’s preceded by “I” and “like” and followed by “cats” would be trivial, or how the model could “indirectly see the target word”.

I saw this question asked on a number of online platforms, but it never got a response. It would be great if someone with a good understanding of this could give an explanation.

submitted by /u/StrictlyBrowsing

[P] pyTsetlinMachineParallel released – parallel interpretable machine learning with propositional logic.


Back to bits

I have been working on speeding up the Tsetlin Machine. It turns out that merely synchronizing at the clause level maintains robust inference, with little loss in parallelism. E.g., 25 threads reach the current Fashion MNIST peak test accuracy of 91.49% 18 times faster than a single thread.

pyTsetlinMachineParallel provides a multi-threaded implementation of the Tsetlin Machine, Convolutional Tsetlin Machine, Regression Tsetlin Machine, and Weighted Tsetlin Machine, with support for continuous features and multi-granular clauses.

https://github.com/cair/pyTsetlinMachineParallel

submitted by /u/olegranmo

[D] Deep-Learning The “Hardest” Go Problem in the World

https://blog.janestreet.com/deep-learning-the-hardest-go-problem-in-the-world/

Here’s a recent post about one of our experiments using AlphaZero-like self-play learning to explore a fun little microcosm within Go that prior Go bots, including superhumanly strong AlphaZero-based bots, completely fail at.

Obviously, this is not an attempt at tackling any grand open challenge, but hopefully it is still a useful case study that touches on some of the remaining weaknesses of modern deep RL in games.

Hope you find it interesting!

submitted by /u/icosaplex

[D] NLP – Difference between Answer Selection and Answer Re-ranking?

Suppose you are given a dataset (e.g. InsuranceQA) with a list of questions and a list of candidate-answer passages (each question can be matched with multiple candidate answers), and the task is to rank the top 10 most relevant answers (passages) for a given query (the ordering is not labeled).

I have been reading about possible ways to solve this problem and came across both Answer Selection and Answer Re-ranking techniques.

I would like to know, in terms of NLP tasks, what the difference is between Answer Selection and Answer Re-ranking.

My current understanding is that Answer Re-ranking is where you first use something like BM25 to retrieve a list of, say, 1000 candidate answers and then re-rank the relevant ones, whereas Answer Selection may or may not use re-ranking to achieve this.
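As a toy illustration of that retrieve-then-re-rank pattern (the from-scratch BM25 and the trivial term-overlap “re-ranker” below are stand-ins I wrote for the sketch, not InsuranceQA code; a real system would use a neural model in stage 2):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against a tokenized query with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency
    scores = []
    for d in docs:
        tf, dl, s = Counter(d), len(d), 0.0
        for t in query:
            if t in tf:
                idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
                s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * dl / avgdl))
        scores.append(s)
    return scores

passages = ["the policy covers flood damage to the home",
            "premium payments are due monthly",
            "flood claims require photos of the damage"]
docs = [p.split() for p in passages]
query = "does my policy cover flood damage".split()

# Stage 1 (retrieval): keep only the top-k candidates by BM25 score.
scores = bm25_scores(query, docs)
candidates = sorted(range(len(docs)), key=lambda i: -scores[i])[:2]

# Stage 2 (re-ranking): re-score just the surviving candidates with a
# stronger model; a term-overlap count stands in for a neural ranker.
reranked = sorted(candidates, key=lambda i: -len(set(query) & set(docs[i])))
```

The point of the two stages is cost: the expensive stage-2 model only ever sees the handful of candidates that survive the cheap stage-1 retrieval. Answer Selection setups, by contrast, often score every candidate directly with a single model.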

submitted by /u/rosamundo

[N] Sunil Mallya on AWS Deepracer & Sagemaker RL

Hi everyone,

We would like to share the latest episode of our podcast, “The Humans of AI”. The latest episode features Sunil Mallya, who is the Principal Deep Learning Scientist at the Amazon Web Services Machine Learning Lab and one of the key people behind AWS Deepracer & Sagemaker RL.

In this episode we explore the rich interaction Sunil has had with machine learning over the course of his career: what drew him to the field, his experience as an early-stage machine learning entrepreneur, his knowledge of building distributed machine learning systems, and finally the two biggest projects he has taken on recently – Sagemaker RL & AWS Deepracer!

We also discuss many other topics, including what the machine learning scene is like in San Francisco, some real-world applications of Reinforcement Learning, and what it takes to be a part of Amazon’s ML Lab.

You can find The Humans of Ai on:

iTunes here:

https://podcasts.apple.com/au/podcast/the-humans-of-ai/id1464995550

Spotify here:

https://open.spotify.com/show/2RY5mcNl0iAs8HUTNbwT0J

Stitcher here:

https://www.stitcher.com/s?fid=414486&refid=stpr

Moderators, if you feel that this is not the right place to post, feel free to remove it; we completely understand. But we do feel that this episode in particular has a lot of information that could benefit the machine learning community. Cheers and all the best.

submitted by /u/sigmoidp