
[P] Real-time detection of handwritten digits

Hello folks,

I want to implement a handwritten digit detector that works in real time, drawing boxes over the video stream to show the user that digits have been detected. Training and detection both run on a GPU.

Since I am relatively new to TensorFlow and CNNs, I investigated some networks and tried to train them on MNIST and on some other labeled images I made myself.

So far I have used an SVM, LeNet-5, R-CNN, and YOLOv2.

From what I have read so far, I think YOLOv2 or YOLOv3 would be an appropriate network for the task, because they are very fast at detection. But they have so many layers and seem very complex.

Do I need to choose such a complex CNN? It was originally intended for detecting many object classes in natural images, and I only use it for handwritten digits (only 10 classes).

As I already said, I am new to the topic, so be nice… ^^
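
If a full detector feels like overkill, one lighter pipeline worth sketching is classical blob localization plus a small classifier: threshold each frame, find contours, and classify each crop with a CNN trained on MNIST. Below is a minimal sketch of that idea, assuming OpenCV 4 and a hypothetical pretrained Keras model saved as mnist_cnn.h5; the thresholds and noise filter are illustrative, not a definitive implementation.

```python
import cv2
import numpy as np
import tensorflow as tf

# Hypothetical pretrained MNIST classifier (28x28 grayscale in, 10 classes out).
model = tf.keras.models.load_model("mnist_cnn.h5")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Otsu threshold, inverted so dark pen strokes become white blobs
    # (matching MNIST's white-on-black convention).
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # OpenCV 4 returns (contours, hierarchy).
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h < 100:  # illustrative noise filter
            continue
        crop = cv2.resize(binary[y:y + h, x:x + w], (28, 28)).astype("float32") / 255.0
        digit = int(np.argmax(model.predict(crop[None, :, :, None], verbose=0)))
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, str(digit), (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imshow("digits", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```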

submitted by /u/DeepStrategy

[D] How to deal with semantic segmentation datasets?

In segmentation there is usually a lack of datasets, so I want to use more than one. For example, let's say I want to segment four classes: cars, buses, bikes, and background, where background is anything other than the vehicles.

I have three datasets. Two of them have ground truth for cars, buses, and bikes, but the last one doesn't label the bikes and just ignores them. I want to use all three datasets; is there a trick to get away with this? I'm using softmax and a logical NOR over the vehicle masks to get the background. It worked using only the first two datasets, but I want to add the last one. I asked this question on StackOverflow and didn't get an answer. Thank you.
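
One possible trick (a hedged sketch, not a guaranteed fix): for the dataset that never labels bikes, treat its background pixels as "background or bike" and sum the softmax probability over that candidate set before taking the log, so the network is never penalized for predicting bike there. The class indices and shapes below are assumptions.

```python
import torch
import torch.nn.functional as F

# Assumed class indices: 0=background, 1=car, 2=bus, 3=bike.
def partial_label_loss(logits, target, bikes_unlabeled):
    # logits: (N, 4, H, W); target: (N, H, W) int64;
    # bikes_unlabeled: (N,) bool, True for images from the bike-less dataset.
    probs = F.softmax(logits, dim=1)
    # Probability assigned to the labeled class at each pixel.
    p_target = probs.gather(1, target.unsqueeze(1)).squeeze(1)
    # For bike-less images, a "background" pixel may really be a bike,
    # so credit the model for probability mass on either class.
    ambiguous = bikes_unlabeled.view(-1, 1, 1) & (target == 0)
    p_bg_or_bike = probs[:, 0] + probs[:, 3]
    p = torch.where(ambiguous, p_bg_or_bike, p_target)
    return -torch.log(p.clamp_min(1e-8)).mean()
```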

submitted by /u/blue20whale

[D] Generating small graphs using Graph Neural Networks

I have been looking at using Graph Neural Networks as a classifier. The example here: https://towardsdatascience.com/hands-on-graph-neural-networks-with-pytorch-pytorch-geometric-359487e221a8 was a good intro for me – given lots of small graphs (RecSys 2015 YooChoose challenge data), can you predict what users will buy? This seems to get good results (I am unsure why it was appropriate to use a variant of GraphSAGE though; the documentation recommends it for very large graphs – are there any suggestions as to why this was OK here?).
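
For context, the tutorial's setup boils down to graph-level classification: message-passing layers followed by global pooling, so each small graph collapses to one vector. Here is a minimal PyTorch Geometric sketch of that pattern; the dimensions and class count are illustrative assumptions, not the tutorial's exact model.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv, global_mean_pool

class GraphClassifier(torch.nn.Module):
    def __init__(self, in_dim=64, hidden=128, num_classes=2):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden)
        self.conv2 = SAGEConv(hidden, hidden)
        self.lin = torch.nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        # Message passing over each (small) graph's edges.
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        # Pool node embeddings: one vector per graph in the batch.
        x = global_mean_pool(x, batch)
        return self.lin(x)
```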

However, what if I want to go a step further and generate new graphs? How could this be accomplished? One generative graph approach, GraphGAN, is designed to be trained on one very large graph, as opposed to lots of smaller ones. Is there work that looks at doing what I am hoping to accomplish?

Thanks

submitted by /u/vaaalbara

[P] MobileNet trained on the VoxCeleb dataset for speaker recognition, tuned for speaker verification

Thought maybe some people would be interested in this project I worked on last year. I used the VoxCeleb data to train MobileNet for speaker recognition: the sound data is processed into a spectrogram, and then the first- and second-order derivatives are calculated to get three-channel input. After training was done, I used a Siamese-model technique to tune the features for verification instead of categorization. The idea of the project was to run the model on a smartphone (hence MobileNet) and use it for speaker verification.
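
For anyone curious what that front end might look like, here is a minimal sketch of the described preprocessing – a log-mel spectrogram plus its first- and second-order deltas stacked as three channels, giving MobileNet an image-like input. The sample rate, mel-band count, and function name are illustrative assumptions, not the repo's actual code.

```python
import librosa
import numpy as np

def three_channel_spectrogram(wav_path, sr=16000, n_mels=64):
    # Load audio and compute a log-scaled mel spectrogram.
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel)
    # First- and second-order derivatives along the time axis.
    delta1 = librosa.feature.delta(log_mel, order=1)
    delta2 = librosa.feature.delta(log_mel, order=2)
    # Stack into (n_mels, frames, 3), analogous to an RGB image.
    return np.stack([log_mel, delta1, delta2], axis=-1)
```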

I'm curious what others think about the techniques used and the results – let me know if you are interested in more details!

https://github.com/jpinedaa/Voice-ML (code is messy as hell, I might organize it later)

submitted by /u/ExtremeGeorge

[D] Why do Variational Autoencoders encode each datapoint to an individual normal distribution over z, rather than forcing all encodings Z to be normally distributed?

As in the title. Variational autoencoders encode each data sample x_i to a distribution over z, and then minimize the KL divergence between q(z_i | x_i) and p(z), where p(z) is N(0, I). In cases where the encoder does a good job of minimizing the KL loss, the reconstruction is often poor, and in cases where the reconstruction is good, the encoder may not do a good job of mapping onto p(z).
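
For reference, the per-datapoint KL term being described has a standard closed form when the encoder outputs a diagonal Gaussian; a minimal PyTorch sketch (variable names are illustrative):

```python
import torch

def kl_to_standard_normal(mu, logvar):
    # mu, logvar: (batch, latent_dim) outputs of the encoder.
    # Closed form of KL( N(mu, diag(sigma^2)) || N(0, I) ),
    # summed over latent dims and averaged over the batch.
    return -0.5 * torch.mean(torch.sum(1 + logvar - mu ** 2 - logvar.exp(), dim=1))
```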

Is there some reason why we can't just feed in all datapoints x, which gives us a distribution over all encodings z, and then force those encodings to be normally distributed (i.e. find the mean and stdev over z, and penalize their distance from N(0, I))? This way, you don't even need to use the reparameterization trick. If you wanted to, you could still have each point be a distribution; you would just need to take each individual variance into account as well as the means.
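
Concretely, the proposal amounts to a batch-level moment-matching penalty on deterministic codes; a minimal sketch of that regularizer (shapes and the weighting are assumptions, not a settled recipe):

```python
import torch

def batch_moment_penalty(z):
    # z: (batch, latent_dim) deterministic encodings of a minibatch.
    mean = z.mean(dim=0)
    std = z.std(dim=0)
    # Push the aggregate code distribution toward zero mean, unit std.
    return (mean ** 2).mean() + ((std - 1.0) ** 2).mean()

# Illustrative usage, with lam a hypothetical weighting hyperparameter:
# loss = reconstruction_loss + lam * batch_moment_penalty(encoder(x))
```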

I've tested this out and it works without any issue, so is there some theoretical reason why it's not done this way? Is it standard practice in variational methods for each datapoint x_i to have its own distribution, and if so, why?

submitted by /u/DearJudge