
Category: Reddit MachineLearning

[D] Deep learning with Python and Keras – course is different from the book!!

I did a course on Udemy called ‘Deep Learning with Python and Keras’, thinking it was a course version of the similarly named book by François Chollet (the author of Keras). But it’s not!

It was stupid of me not to realize this earlier. The course was made by two other guys, and having finished it, I think it’s average at best.

  • The worst part is that the examples they use in the course hardly prove the theory. For example, they make a claim about the influence of batch size on accuracy and rate of learning, but when you run their example it does not bear the claim out.
  • Some of the data they use is so low in volume and quality that minor changes to the test/train split or to the hyperparameters give huge swings in the results.
  • Another strange thing I noticed was that the test error was lower than the training error in a lot of cases.
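For what it’s worth, the last bullet is not necessarily a red flag on its own: regularizers such as dropout are active during training but disabled at evaluation, so the training-mode loss is measured on a handicapped model and can end up above the test loss. A minimal sketch of that train/eval asymmetry (in PyTorch here purely for illustration; the course itself uses Keras):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.5)
x = torch.ones(1000)

layer.train()
out_train = layer(x)   # roughly half the units are zeroed (survivors are
                       # scaled by 1/(1-p)), so train-mode metrics come
                       # from a handicapped model

layer.eval()
out_eval = layer(x)    # identity: dropout is a no-op at eval time
print(torch.equal(out_eval, x))  # True
```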

Overall, I would strongly urge people to learn the theory from other strong courses (like Andrew Ng’s) and, for Keras, to buy the book by its author.

The course was made by two guys whose only ML experience is teaching it through online courses and bootcamps. Neither has an academic background nor industry experience in ML. That’s not to say an academic background or industry experience is necessary to be an expert in ML, but when someone has neither, I feel a little skeptical.

submitted by /u/Correct-Mortgage

[P] vedaseg: a semantic segmentation toolbox in PyTorch

Introduction

vedaseg is an open source semantic segmentation toolbox based on PyTorch.

Features

  • Modular Design
    We decompose the semantic segmentation framework into different components. This flexible and extensible design makes it easy to implement a customized semantic segmentation project by combining different modules, like building with Lego bricks.
  • Support of several popular frameworks
    The toolbox supports several popular semantic segmentation frameworks out of the box, e.g. DeepLabv3+, DeepLabv3, UNet, PSPNet, FPN, etc.
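The “combining modules” idea might be sketched roughly like this in PyTorch (all names and layer sizes are made up for illustration; this is not vedaseg’s actual API):

```python
import torch
import torch.nn as nn

# Rough illustration of the modular design: encoder, decoder and head
# are independent modules you can swap without touching the rest.
class SegModel(nn.Module):
    def __init__(self, encoder, decoder, head):
        super().__init__()
        self.encoder, self.decoder, self.head = encoder, decoder, head

    def forward(self, x):
        return self.head(self.decoder(self.encoder(x)))

model = SegModel(
    encoder=nn.Conv2d(3, 16, 3, padding=1),   # stand-in backbone
    decoder=nn.Conv2d(16, 16, 3, padding=1),  # stand-in decoder
    head=nn.Conv2d(16, 21, 1),                # 21 classes, e.g. PASCAL VOC
)
out = model(torch.randn(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 21, 64, 64])
```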

submitted by /u/jackson_ditred

[D] ML on resource-constrained devices (MCUs)

Do you think SVMs on 8-bit microcontrollers can be of help in ML on the edge? I can’t understand why everyone talks about ANNs on microcontrollers and nobody has thought about SVMs / decision trees / random forests, which should be much smaller in size. I wrote a couple of posts on the topic and would really appreciate any suggestions or comments on the subject.

https://eloquentarduino.github.io/2019/11/you-can-run-machine-learning-on-arduino/
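To illustrate the size argument: a trained linear SVM’s inference is a single dot product, so the exported model is just n_features + 1 floats. A hedged scikit-learn sketch (dataset and sizes are made up for the example):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Toy binary classification data, 4 features.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LinearSVC(dual=False).fit(X, y)

# The entire model is the weight vector plus the bias term.
w, b = clf.coef_[0], clf.intercept_[0]
print(f"model size: {w.size + 1} floats")  # 5 floats

def predict(x):
    # This one line is the whole inference routine you would port to C
    # (or fixed-point arithmetic) on the microcontroller.
    return int(np.dot(w, x) + b > 0)

assert all(predict(x) == p for x, p in zip(X, clf.predict(X)))
```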

submitted by /u/EloquentArduino

[D] Methods to handle streaming/real-time data storage, wrangling and prediction?

Say there is data being streamed into Python (Kafka, Kinesis, etc.) every 10 seconds that I would like to wrangle and predict on. What is the best way to store this streaming data in order to do this? In the past I have used online learning methods for this; I am curious how to do it with a batch learning method.

I was thinking we could iteratively populate a DataFrame with this data until the stream stops, preprocess the entire DataFrame, predict, then clear/delete the DataFrame. One caveat I can think of with this method is the scenario in which the preprocessing and predicting take longer than 10 seconds.
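The populate-then-clear idea could be sketched in plain Python like this (`preprocess` and `model_predict` are placeholders for the real wrangling and model; the real versions would operate on a DataFrame):

```python
from collections import deque

buffer = deque()

def preprocess(rows):
    return [r * 2 for r in rows]       # placeholder wrangling step

def model_predict(batch):
    return [x > 5 for x in batch]      # placeholder model

def on_message(record):
    # Called by the consumer (Kafka/Kinesis) for each incoming record.
    buffer.append(record)

def flush_and_predict():
    # Pop exactly the records present now, so messages arriving while
    # predict() runs stay queued for the next batch instead of being lost.
    n = len(buffer)
    batch = [buffer.popleft() for _ in range(n)]
    return model_predict(preprocess(batch))

for r in (1, 2, 3, 4):
    on_message(r)
print(flush_and_predict())  # [False, False, True, True]
```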

What are some ways to handle this?

submitted by /u/Straighteight424

[D] A potential sneaky strategy to get your ICLR paper accepted [Not Recommended!]

I went through some papers on ICLR’s OpenReview and noticed what may be a sneaky strategy. It looks like when a paper gets a heated discussion with harsh attacks from a pseudonymous commenter, the chair usually sides with the authors and accepts the paper, even against the original reviewers’ scores. Wondering if any author has employed such a strategy? Quite sneaky!

submitted by /u/thntk

[P] Batch Normalization in GANs

Hello everyone. I’ve been working on generating paintings for my Master’s thesis. So far I’ve been having a really difficult time training GANs, which is par for the course.

One of the issues I’ve run into so far is that the outputs seem to share the same characteristics, i.e. all the paintings it produces have the same color palette. I’ve used the Least Squares GAN with Spectral Normalization. I’ve read that one way of combating this issue is minibatch discrimination, but that seems to make the results worse for whatever reason (maybe there’s an optimal number of features you’re supposed to concatenate?).

So my question is about Batch Normalization, which seems to be the culprit behind this same-palette/texture issue: is it better just to use Instance Norm / Pixel Norm / Layer Norm instead of Batch Norm? Do those produce good results? I’ve been having a lot of issues with TensorFlow, so I’d like to know if anybody else has tried these and gotten results out of them. Let’s imagine we’re talking about just a DCGAN with batch norm replaced by any of the above.
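For concreteness, the swap itself is a one-line change in a DCGAN-style generator block, sketched here in PyTorch (layer sizes are illustrative; the poster works in TensorFlow). The intuition is that InstanceNorm normalizes each sample with its own statistics rather than shared batch statistics:

```python
import torch
import torch.nn as nn

def gen_block(in_ch, out_ch, norm="instance"):
    # InstanceNorm2d uses per-sample statistics; BatchNorm2d shares
    # statistics across the whole batch.
    norm_layer = {"batch": nn.BatchNorm2d, "instance": nn.InstanceNorm2d}[norm]
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, 4, stride=2, padding=1),  # upsample 2x
        norm_layer(out_ch),
        nn.ReLU(inplace=True),
    )

block = gen_block(8, 4)                 # illustrative channel sizes
out = block(torch.randn(2, 8, 16, 16))
print(out.shape)  # torch.Size([2, 4, 32, 32])
```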

As a bonus question, it doesn’t seem to be picking up fine detail. Any tips? (I’ve tried self-attention; that wasn’t a big help either.)

P.S. Resources are limited; I’m running on a Quadro P1000 for a day at most.

submitted by /u/96meep96

[P] Are there any good datasets for A4 document detection?

What I need doesn’t necessarily have to be annotated at all; I just need photos of 2D documents, at least to visually evaluate a network that’s been trained on synthetic data.

The only one I’ve found so far is the ICDAR 2015 document capture competition dataset. It’s okay, but it contains a lot of compression artefacts (being a video dataset) which introduce some data mismatch, so I’m looking for something cleaner.

Thank you.

submitted by /u/uqw269f3j0q9o9