Category: Reddit MachineLearning

[D] How do object detection algorithms and feature extractor networks work together for action detection?

I’m talking about feature extractor architectures such as AlexNet and Inception, and object detectors such as YOLO and SSD. I’ve read a bit online and I’m really confused about how they work together.
Let’s say I want to detect a specific object or person in a video and put a box around them with a label describing their state. How would that work? What steps would the object detector and the feature extractor each take? A workflow for this would be really helpful.
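One possible workflow, sketched very roughly: a detector (which typically contains its own CNN backbone as a feature extractor) proposes boxes for each frame, each box is cropped, and a second model built on a feature extractor labels the state of the crop. Everything in the sketch below is a hypothetical stand-in: `detect_objects` and `classify_state` are toy functions, not YOLO, SSD, or Inception, and the "frame" is a tiny grid of pixel values.

```python
from dataclasses import dataclass
from typing import List, Tuple

Frame = List[List[int]]  # toy grayscale frame: rows of pixel values


@dataclass
class Detection:
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive
    label: str


def detect_objects(frame: Frame) -> List[Detection]:
    """Hypothetical stand-in for a detector such as YOLO or SSD.

    A real detector runs a CNN backbone plus a detection head; this toy
    version just boxes all non-zero pixels.
    """
    coords = [(x, y) for y, row in enumerate(frame)
              for x, value in enumerate(row) if value > 0]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return [Detection((min(xs), min(ys), max(xs) + 1, max(ys) + 1), "object")]


def crop(frame: Frame, box: Tuple[int, int, int, int]) -> Frame:
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


def classify_state(patch: Frame) -> str:
    """Hypothetical stand-in for a state classifier on extracted features.

    Here the only "feature" is mean brightness.
    """
    pixels = [value for row in patch for value in row]
    return "bright" if sum(pixels) / len(pixels) > 128 else "dark"


def annotate(frame: Frame):
    """Per frame: detect, crop each box, then label the state of each crop."""
    return [(d.box, d.label, classify_state(crop(frame, d.box)))
            for d in detect_objects(frame)]


frame = [
    [0, 0, 0, 0],
    [0, 200, 220, 0],
    [0, 210, 0, 0],
    [0, 0, 0, 0],
]
result = annotate(frame)
print(result)
```

For a video, the same `annotate` step would run on every frame (or every n-th frame), and the boxes and state labels would be drawn onto the output.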

submitted by /u/LessTell

[D] Effect of Oversampling on classifiers, when combined with image transformations

I am trying to understand the negative consequences of oversampling in the context of image classification. If I apply a decent number of image transformations, I believe oversampling is effectively equivalent to SMOTE for tabular data, since no image is repeated exactly within a batch. Does the behaviour and test-set accuracy of a classifier depend in any way on the actual class distribution of the train set, and am I doing any harm by oversampling?

To take an example, I was training a classifier on a dataset with 5 classes and heavy class imbalance. To balance it, I oversampled the minority classes so that all classes have an equal number of images. This caused a significant performance drop on my test set, while cross-validation performance on the oversampled set was fairly high. When analysing the class distributions, I saw that the original train set ranks the classes 1, 3, 2, 4, 5 in decreasing order of sample count. The test set ranks them 3, 1, 2, 4, 5, but the predictions after training on the oversampled data rank them 4, 3, 2, 5, 1. Mathematically speaking, how can this over-prediction of a less frequently occurring class be explained?
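One way to reason about this is Bayes' rule: the posterior p(y|x) is proportional to p(x|y) p(y), so a classifier trained on a balanced set has effectively learned posteriors under uniform class priors. If the test set follows the original imbalanced priors, rare classes end up over-predicted. The sketch below shows the standard prior-shift correction, which rescales each probability by the ratio of target prior to training prior and renormalises. The function name `adjust_for_priors` and all numbers are made up for illustration; they are not taken from the dataset in the post.

```python
from typing import List


def adjust_for_priors(probs: List[float],
                      train_priors: List[float],
                      target_priors: List[float]) -> List[float]:
    """Re-weight posteriors learned under train_priors to reflect target_priors."""
    weighted = [p * t / s for p, s, t in zip(probs, train_priors, target_priors)]
    total = sum(weighted)
    return [w / total for w in weighted]


def argmax(values: List[float]) -> int:
    return max(range(len(values)), key=values.__getitem__)


# Assumed priors: uniform after oversampling, imbalanced in the original data.
balanced_priors = [0.2] * 5
original_priors = [0.40, 0.25, 0.20, 0.10, 0.05]

# Made-up output of a model trained on the balanced set for one test image.
probs = [0.30, 0.10, 0.10, 0.15, 0.35]

adjusted = adjust_for_priors(probs, balanced_priors, original_priors)
print("before:", argmax(probs), "after:", argmax(adjusted))
```

In this toy case the uncorrected model picks the rarest class, while the corrected probabilities favour the most frequent one. Whether this fully explains the drop in the post is an open question, since augmented duplicates can also make cross-validation scores optimistic when copies of the same image land in both folds.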

submitted by /u/Atom_101

[D] What are the current SOTA architectures for NLP information extraction & question answering?

I have been working primarily in a different field of DL for a while, but I have a project coming up related to NLP. I’ve done some research, and the models that show up most frequently are GPT-2, BERT, and ELMo. However, I am under the impression that these are overshadowing others that may be better suited to the task.

If it’s relevant: my domain expertise is in medicine, and I intend to use the model for medical purposes.

submitted by /u/Naveos

[P] For NLP beginners, simple PyTorch implementation of Language Modeling

A step-by-step tutorial on how to implement a simple language model and adapt it to Wikipedia text.

Pre-trained BERT and XLNet models are publicly available! But for NLP beginners, they can be hard to use or adapt without first understanding them fully. For those readers, I cover the whole end-to-end implementation process for language modeling, using the recurrent networks we already know.

I hope that this repo can be a good solution for people who want to have their own language model 🙂

https://github.com/lyeoni/pretraining-for-language-understanding

submitted by /u/lyeoni

[D] Basic RNN predicting more than 1 timestep w/ Keras (Python)

I’ve been working with RNNs for a little while now but prior to dipping my toe in this area I’ve successfully implemented a few basic feed forward models into a production environment. I like to think I understand the premise posed by recurrent topology (GRU, LSTM, for instance). I’m struggling with the basic shape of the data and/or the proper parameters for my training data.

Here’s a basic example I’ve been playing with for many in, one out (omitting the fancy Keras utils that do automatic encoding / mapping):

The Data / imports

    import numpy as np
    from keras.utils import to_categorical, plot_model
    from keras.preprocessing.text import Tokenizer
    from keras.preprocessing.sequence import pad_sequences
    from keras.models import Sequential
    from keras.layers import Dense, Embedding, LSTM, GRU, Dropout, Flatten
    from keras import callbacks, regularizers, optimizers

    X = [
        ["a", "b", "c"], ["b", "c", "d"], ["c", "d", "e"], ["d", "e", "f"],
        ["e", "f", "g"], ["f", "g", "h"], ["g", "h", "i"], ["h", "i", "j"],
    ]

    # Maps to each observation in X, 2 seq out
    y_b = [
        ["d", "e"], ["e", "f"], ["f", "g"], ["g", "h"],
        ["h", "i"], ["i", "j"], ["j", "k"], ["k", "m"],
    ]

    # Maps to each observation in X, 1 seq out
    y_a = [["d"], ["e"], ["f"], ["g"], ["h"], ["i"], ["j"], ["k"]]

    # Translate characters to their ordinal offsets ("a" -> 0, "b" -> 1, ...)
    letters = [chr(i) for i in range(97, 97 + 26)]
    encode = lambda seq: np.array([[letters.index(i) for i in obs] for obs in seq])

    X_encoded = encode(X)
    y_encoded = encode(y_a)  # y_a == predict single timestep, y_b == predict 2 timesteps

The resulting design matrix should look something like this:

    array([[0, 1, 2],
           [1, 2, 3],
           [2, 3, 4],
           [3, 4, 5],
           [4, 5, 6],
           [5, 6, 7],
           [6, 7, 8],
           [7, 8, 9]])

Reshaping for network

    sequence_length = 3
    X_reshaped = np.reshape(X_encoded, (len(X_encoded), sequence_length, 1))
    X_reshaped = to_categorical(X_reshaped)  # This is from keras.utils
    y_cat = to_categorical(y_encoded)

y_cat ends up one-hot encoded (binary-like), with a 1 at the index representing each unique entity:

    sequence_length = 3
    X_reshaped = np.reshape(X_encoded, (len(X_encoded), sequence_length, 1))
    ## Experimented with
    ## X_reshaped = to_categorical(X_reshaped)
    y_cat = to_categorical(y_encoded)
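For reference, the one-hot matrix described above can be reproduced without Keras. This is a minimal sketch in plain Python, assuming the `y_a` targets and the a = 0 ... z = 25 encoding from the post. `one_hot` is a hypothetical helper; its default width of max index + 1 mirrors what `to_categorical` does when `num_classes` is not given.

```python
def one_hot(indices, num_classes=None):
    """Return one row per index, with a single 1 at that index."""
    if num_classes is None:
        num_classes = max(indices) + 1
    return [[1 if j == i else 0 for j in range(num_classes)] for i in indices]


y_a = ["d", "e", "f", "g", "h", "i", "j", "k"]
y_encoded = [ord(ch) - ord("a") for ch in y_a]  # "d" -> 3, ..., "k" -> 10

y_cat = one_hot(y_encoded)
for row in y_cat:
    print(row)
```

Each of the 8 rows has 11 columns (indices 0 through 10), and the first row has its 1 at index 3, the offset of "d".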

The Model

    ## Network topology
    model = Sequential()
    model.add(GRU(64, input_shape=(X_reshaped.shape[1], X_reshaped.shape[2])))

    ## This doesn't seem to be necessary
    # model.add(Flatten())

    ## This, I believe, sets the assumption about the output in terms of categorical encoding
    model.add(Dense(y_cat.shape[1], activation="softmax"))

    rmsprop = optimizers.RMSprop(lr=0.1)
    model.compile(loss="categorical_crossentropy", optimizer=rmsprop, metrics=["accuracy"])

    model_params = dict(
        x=X_reshaped,
        y=y_cat,
        epochs=50,
        batch_size=2,
        verbose=1,
        # callbacks=[keras_tensorboard],
        validation_split=0.3,
    )

    history = model.fit(**model_params)

My basic 3-in, 1-out network (predicting y_a) works just fine. The problems start when I try to predict more than one timestep (y_b) and update the parameters of the 2nd layer (Dense). The assumptions I’ve made about the shape and the network seem to be incorrect, because the library throws an error about the shape of my ground truth (y).

  1. Is this the proper topology for this type of problem?
  2. Is my y encoded improperly?

Of course I’m interested in solving for the multi sequence output but more importantly, I’m hoping to understand “why” more than “how”. Thanks in advance for any advice or help!
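A likely source of the shape error, sketched in plain Python under the post's encoding: one-hot encoding `y_b` yields a 3-D target of (samples, timesteps, classes), whereas a GRU that returns only its last state, followed by a Dense layer, emits a 2-D output of (samples, classes). The helpers `shape` and `one_hot` below are hypothetical and exist only to make the shapes visible; the widths of 11 and 13 follow from the largest letters in `y_a` ("k") and `y_b` ("m").

```python
def shape(nested):
    """Shape of a regular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def one_hot(index, num_classes):
    return [1 if j == index else 0 for j in range(num_classes)]


def encode(seq):
    return [[ord(ch) - ord("a") for ch in obs] for obs in seq]


y_a = [["d"], ["e"], ["f"], ["g"], ["h"], ["i"], ["j"], ["k"]]
y_b = [["d", "e"], ["e", "f"], ["f", "g"], ["g", "h"],
       ["h", "i"], ["i", "j"], ["j", "k"], ["k", "m"]]

a_idx, b_idx = encode(y_a), encode(y_b)
n_a = max(i for obs in a_idx for i in obs) + 1
n_b = max(i for obs in b_idx for i in obs) + 1

# Single step: one class vector per sample (the length-1 axis collapses).
y_a_cat = [one_hot(obs[0], n_a) for obs in a_idx]
# Two steps: one class vector per timestep per sample.
y_b_cat = [[one_hot(i, n_b) for i in obs] for obs in b_idx]

print(shape(y_a_cat))
print(shape(y_b_cat))
```

The single-step target is 2-D and matches a `Dense(num_classes)` output, while the two-step target has an extra timestep axis. One common way to make a Keras model emit that extra axis is `RepeatVector(2)` after the encoder, then a recurrent layer with `return_sequences=True`, then `TimeDistributed(Dense(num_classes, activation="softmax"))`. That is a suggestion rather than something the post confirms.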

submitted by /u/butter-jesus

[N] Video Understanding Using Temporal Cycle-Consistency Learning

Blog post here. Excerpt from the blog:

We propose a potential solution using a self-supervised learning method called Temporal Cycle-Consistency Learning (TCC). This novel approach uses correspondences between examples of similar sequential processes to learn representations particularly well-suited for fine-grained temporal understanding of videos. We are also releasing our TCC codebase to enable end-users to apply our self-supervised learning algorithm to new and novel applications.

submitted by /u/__arch__