Author: torontoai

[D] Legality of Scraping Training Data from Google Images

I think my original post was removed because I didn’t tag it.

I have a project in mind. I want to build an image classifier with novel classes. For example, let’s say I want to classify images of different types of bicycles. Google Images is rife with images for each type of bike.

I want to publish a blog post about my project and put my code (including the scraper) on GitHub, but not upload the image files anywhere. I might put up a (free) endpoint hosting the resulting classifier if it works.

Questions:

  1. Are all images on Google Images fair game for training data, or do I have to limit myself to images “labelled for reuse”?
  2. Do I have to cite the images I use as training data?
  3. I’ve read about “fair use”. How does that figure in here?

Thanks, and sorry if this has been covered elsewhere

submitted by /u/am_i_having_fun

[R] Learning to Predict Without Looking Ahead: World Models Without Forward Prediction (NeurIPS2019)

Recent work from a group at Google Brain.

Abstract

Much of model-based reinforcement learning involves learning a model of an agent’s world, and training an agent to leverage this model to perform a task more efficiently. While these models are demonstrably useful for agents, every naturally occurring model of the world of which we are aware (e.g., a brain) arose as the byproduct of competing evolutionary pressures for survival, not minimization of a supervised forward-predictive loss via gradient descent. That useful models can arise out of the messy and slow optimization process of evolution suggests that forward-predictive modeling can arise as a side-effect of optimization under the right circumstances. Crucially, this optimization process need not explicitly be a forward-predictive loss. In this work, we introduce a modification to traditional reinforcement learning which we call observational dropout, whereby we limit the agent’s ability to observe the real environment at each timestep. In doing so, we can coerce an agent into learning a world model to fill in the observation gaps during reinforcement learning. We show that the emerged world model, while not explicitly trained to predict the future, can help the agent learn key skills required to perform well in its environment.
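To make the idea concrete, here is a minimal sketch of the data flow that observational dropout implies, as I read the abstract: at each timestep the agent sees the real observation only with some probability, and otherwise sees whatever its world model produces from the last thing it saw and the action it took. All names here (`rollout_with_observational_dropout`, `peek_prob`, `make_counter_env`) are hypothetical, and the toy counter environment is only for illustration. The sketch does not include any training; it only shows what the policy gets to observe.

```python
import random
from typing import Callable, List, Tuple


def make_counter_env() -> Callable[[float], float]:
    """Toy environment (an assumption for illustration): the state is the running sum of actions."""
    state = {"x": 0.0}

    def step(action: float) -> float:
        state["x"] += action
        return state["x"]

    return step


def rollout_with_observational_dropout(
    env_step: Callable[[float], float],
    policy: Callable[[float], float],
    world_model: Callable[[float, float], float],
    initial_obs: float,
    peek_prob: float,
    steps: int,
    rng: random.Random,
) -> List[Tuple[float, bool]]:
    """Run one episode and return (what the agent saw, whether it was real) per step."""
    seen = initial_obs
    trace = []
    for _ in range(steps):
        action = policy(seen)
        real_obs = env_step(action)  # the real environment always advances
        if rng.random() < peek_prob:
            seen, was_real = real_obs, True  # the agent is allowed to peek
        else:
            # The model fills the gap from the last seen observation and the action.
            seen, was_real = world_model(seen, action), False
        trace.append((seen, was_real))
    return trace
```

With `peek_prob=1.0` this reduces to an ordinary rollout, and with `peek_prob=0.0` the policy runs entirely on the model's outputs, so a model that ignores the action would leave the agent blind to its own progress.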

web article: https://learningtopredict.github.io

arxiv: https://arxiv.org/abs/1910.13038

submitted by /u/milaworld

[D] Can we please just STOP talking about Siraj in this subreddit?

I get it: he is a terrible and shitty person for stealing, plagiarizing, and profiting off of it. However, this subreddit is starting to turn into TMZ, with childish cancel culture and zero productive action. I come here to read about cool research and the neat projects that people would love to share. I like it when people have questions about a paper or want feedback on their projects. Can we just ban his videos / content?

submitted by /u/one_pump_trump

[D] Is there any way to classify text based on some given keywords using python?

Hi, I’ve been trying to learn a bit of machine learning for a project that I’m working on. So far I’ve managed to classify text using an SVM with sklearn and spaCy, with some good results. However, I don’t want to classify the text only with the SVM; I also want it to be classified based on a list of keywords that I have. For example, if the sentence contains the word “fast” or “seconds”, I would like it to be classified as “performance”.

I’m really new to machine learning and I would really appreciate any advice.
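One possible way to combine the two, sketched below under stated assumptions: check the keyword list first and only fall back to the trained model when no keyword matches. The `KEYWORDS` table, `keyword_label`, and `classify` are hypothetical names; only the "fast"/"seconds" to "performance" rule comes from the post. The fallback is passed in as a plain function so the sketch stays independent of any particular model.

```python
import re
from typing import Callable, Dict, Optional, Set

# Hypothetical keyword table: label -> trigger words.
KEYWORDS: Dict[str, Set[str]] = {
    "performance": {"fast", "seconds"},
}


def keyword_label(text: str, keywords: Dict[str, Set[str]] = KEYWORDS) -> Optional[str]:
    """Return the first label whose trigger words appear as whole words in the text, else None."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    for label, triggers in keywords.items():
        if words & triggers:
            return label
    return None


def classify(text: str, fallback: Callable[[str], str]) -> str:
    """Keyword rules win; otherwise defer to the trained model wrapped by `fallback`."""
    label = keyword_label(text)
    return label if label is not None else fallback(text)
```

Assuming the existing SVM is wrapped in something with a `predict` method that accepts a list of strings, the fallback could be `lambda t: model.predict([t])[0]`. Matching on whole words means "breakfast" does not trigger the "fast" rule; whether rules should override the model or just add a feature is a design choice this sketch does not settle.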

submitted by /u/KOWZDK

[P] Would like some ideas for a student project based on city data

Hello everyone, I have a project for my AI/machine learning course that will be based on city data, namely Vancouver’s. The link below lists the data sets that our project can be based on. We are free to use outside data sets, but they must relate to our city.

https://opendata.vancouver.ca/explore

I would really appreciate any guidance or input. We’re having a difficult time coming up with ideas; the only ones we have so far are predicting house property values, bike theft, and general crime, all of which would combine features from other data sets.

Thank you for reading my post!

submitted by /u/JohnMcClapperson

[D] I need to interpolate some of my data and have some design decisions about where in my pipeline this should happen.

I am working with a database that is spread across 7 tables, and for ML work I need to join them together. However, since some features are not sampled as frequently as others, the join leaves a lot of nulls for some features. I want to interpolate these values, but I’m not sure of the most efficient way to do so. For example, let’s say I have feature X sampled every 10 ms, feature Y every 1 second, and a third feature Z every 15 seconds. I could store the interpolated values in the database, but I don’t know if that kind of storage capacity is feasible for us. Alternatively, I could calculate them for each row when I fetch batches for training, but I’m afraid that will become a bottleneck depending on how fast the interpolation is. Is there some obvious way of interpolating this efficiently that I’m not thinking of and that would let me save on memory?
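For reference, the "calculate it per batch" option can be sketched as follows, assuming linear interpolation is acceptable and each slow feature is kept as its raw, time-sorted samples. The function name `interp_at` and the clamping behaviour outside the sampled range are my assumptions, not something from the post. Only the sparse samples are stored; values at the fast feature's timestamps are computed on demand with a binary search per query.

```python
from bisect import bisect_right
from typing import List, Sequence


def interp_at(
    sample_t: Sequence[float],
    sample_v: Sequence[float],
    query_t: Sequence[float],
) -> List[float]:
    """Linearly interpolate (sample_t, sample_v) at each time in query_t.

    sample_t must be sorted ascending. Queries outside the sampled range
    are clamped to the first or last sample value.
    """
    out = []
    for t in query_t:
        i = bisect_right(sample_t, t)  # index of the first sample strictly after t
        if i == 0:
            out.append(sample_v[0])
        elif i == len(sample_t):
            out.append(sample_v[-1])
        else:
            t0, t1 = sample_t[i - 1], sample_t[i]
            v0, v1 = sample_v[i - 1], sample_v[i]
            out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out


# Feature Y sampled at 0 s and 1 s, queried at a few of feature X's timestamps.
y_on_x_grid = interp_at([0.0, 1.0], [0.0, 10.0], [0.0, 0.25, 0.5, 1.0, 2.0])
```

Each query costs one binary search over the slow feature's samples, so the cost per batch grows with the batch size rather than with the size of the fully upsampled table. A vectorized equivalent, such as NumPy's `np.interp`, would likely be faster in practice if NumPy is available.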

submitted by /u/zcleghern