Category: Reddit MachineLearning

[P] A Tool for gAnimating Anime with StyleGAN

I’ve been working on a multipart project involving a reimplementation of StyleGAN and a research tool to interact with trained StyleGAN models.

Here are some example images and GIFs from the project:

I published a couple of blog posts that go into more detail.

In the first blog, I introduce the project and discuss the results, the implementation, and the training data. I also share the code for the tool and my StyleGAN reimplementation:

https://towardsdatascience.com/animating-ganime-with-stylegan-part-1-4cf764578e

The second part is a tutorial that demonstrates how to use the tool to animate images and detect facial features. In it, I supply a compiled (Windows) version of the tool that can be used to follow the tutorial:

https://towardsdatascience.com/animating-ganime-with-stylegan-the-tool-c5a2c31379d?source=friends_link&sk=eec12e2da8c84b9736d32f697da21689

My background before ML was reverse engineering, so building a tool that simplifies visualizing and interacting with the internal representation of a model felt like an important step toward understanding it better. There were some results I did not expect, like how modifying a single feature map can consistently make the same meaningful change across many different images (such as opening or closing a mouth). Also, the ability of some feature maps to act as facial feature detectors without training labels made me interested in applying the same approach to other types of generative models.
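The two observations above, editing one feature map and reading one as a detector, come down to simple array operations on a layer's activations. What follows is only a minimal NumPy sketch, assuming activations of shape (channels, height, width). The function names, the layer shape, and the channel index are hypothetical and are not taken from the tool or the StyleGAN reimplementation.

```python
import numpy as np


def shift_feature_map(activations, channel, delta):
    """Return a copy of `activations` with one feature map shifted by `delta`."""
    edited = activations.copy()
    edited[channel] += delta
    return edited


def feature_map_mask(activations, channel, threshold):
    """Boolean mask of the locations where one feature map exceeds `threshold`."""
    return activations[channel] > threshold


rng = np.random.default_rng(0)
# Hypothetical intermediate layer: 512 feature maps at 16x16 resolution.
activations = rng.normal(size=(512, 16, 16))

# Edit a single (arbitrarily chosen) feature map; all others stay untouched.
edited = shift_feature_map(activations, channel=42, delta=3.0)

# Read the same feature map as a crude spatial detector.
mask = feature_map_mask(activations, channel=42, threshold=1.0)
print(edited.shape, mask.shape)
```

In a real generator the edited activations would then be passed through the remaining layers to render the modified image; that step depends on the model and is omitted here.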

Let me know if you have any questions/comments/corrections/criticisms or know about similar prior work. As this has been a solo project, I’m pretty starved for outside perspectives.

submitted by /u/re_gen

[D] How can I understand the Max Pooling Operation in this Paper?

In this paper, the input is scaled down from 41×81×6 to 40×80×6 with max pooling.

How exactly did they do it?

https://arxiv.org/ftp/arxiv/papers/1901/1901.07761.pdf

I want to do something similar to what is done in this paper, but I have two 65×49 matrices and four 64×48 matrices, because some quantities are defined on nodes and others on elements. (There is one more node than there are elements in each dimension.)

They look like this:

https://imgur.com/a/7Mtq4s4

The problem is that strains can only be defined on elements, and so can the volume fraction (these are the four 64×48 matrices). The displacements can only be defined on nodes (these are the two 65×49 matrices).

I was thinking about adding padding to the smaller matrices so that the dimensions are the same. Is zero-padding okay in this case, since I apply a max pooling operation anyway?
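One pooling configuration that produces exactly this size change is a 2×2 window with stride 1 and no padding, since the output size is floor((n − k) / s) + 1 = (41 − 2) / 1 + 1 = 40 (and likewise 81 → 80). Whether the paper uses this configuration is an assumption. The NumPy sketch below shows the shapes only; the function name and the random data are hypothetical.

```python
import numpy as np


def max_pool_2x2_stride1(x):
    """2x2 max pooling with stride 1 and no padding over the first two axes."""
    return np.maximum.reduce([
        x[:-1, :-1],  # top-left corner of each window
        x[1:, :-1],   # bottom-left
        x[:-1, 1:],   # top-right
        x[1:, 1:],    # bottom-right
    ])


# The size change described for the paper: 41x81x6 -> 40x80x6.
paper_input = np.random.rand(41, 81, 6)
print(max_pool_2x2_stride1(paper_input).shape)  # (40, 80, 6)

# The same operation maps nodal fields (65x49) onto the element grid (64x48),
# so they can be stacked with the element fields without any padding.
nodal = np.random.rand(65, 49, 2)      # e.g. two displacement components
elemental = np.random.rand(64, 48, 4)  # e.g. strains and volume fraction
pooled_nodal = max_pool_2x2_stride1(nodal)
stacked = np.concatenate([pooled_nodal, elemental], axis=-1)
print(stacked.shape)  # (64, 48, 6)
```

Each 2×2 window covers the four nodes of one element, which is why the pooled nodal grid lines up with the element grid. Zero-padding the element fields to 65×49 would also equalize the shapes, but a padded zero wins the max wherever the true values are negative, which is worth checking for signed quantities such as strains.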

submitted by /u/avdalim

[D] What are the best universities/colleges to be an ML prof?

I’m currently at that point at the end of my PhD where I all of a sudden need to start applying for jobs. I’m applying for both post-doc and prof positions. I thought it was worth a shot, even though my guess is that I’m unlikely to get a prof position at a high-tier institution straight out of a PhD.

It looks like there are a ton of tenure-track positions open across Europe and North America. How do you choose between them? What are the factors that are important to consider when applying for prof jobs? (e.g. is the number of papers the department gets into NeurIPS a good metric?) And are there any universities that are generally considered the best for ML profs?

submitted by /u/ilia10000

[D] Ways to classify text with very low training data?

What techniques does this group feel work best when classifying text with very little training data?

I ask because I recently put together a tutorial that shows how to use TensorFlow Data Pipelines and NLP classification (BERT) and it gets 85% accuracy, but it tends to work best when there are at least 200 examples of a particular class.

I am not sure this technique will work if I only have one or two training examples for a class. For example, suppose I have a piece of text that says “I was in a line today for 3 hours” and I want to classify it as “Long wait times”, but I only have one or two examples of that kind of text. I think this problem gets worse with engineering text or text that is specific to a corporation (where it would be difficult to generate examples or to get Mechanical Turk workers to classify them correctly).
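To make the one-or-two-examples setting concrete, here is a small nearest-neighbour baseline: a new text gets the label of its most similar labelled example. This is a sketch of one possible baseline, not the BERT pipeline from the tutorial. It uses bag-of-words cosine similarity, the function names are hypothetical, and the second class is invented for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, labelled_examples):
    """Return the label of the labelled example most similar to `text`."""
    query = bag_of_words(text)
    _, best_label = max(
        labelled_examples,
        key=lambda pair: cosine(query, bag_of_words(pair[0])),
    )
    return best_label


examples = [
    ("I was in a line today for 3 hours", "Long wait times"),
    ("The agent was rude to me on the phone", "Rude staff"),  # hypothetical class
]
print(classify("Stood in a line for 2 hours", examples))  # Long wait times
```

Word overlap is brittle when the wording changes. Replacing `bag_of_words` with vectors from a pretrained sentence encoder would keep the same nearest-neighbour logic while relying less on exact word matches, though how well that works on corporate or engineering text is an open question.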

What are your thoughts on this? Have you seen better ways to classify text when there is very little data?

submitted by /u/ThinkCritically

[R] Accurate and interpretable modelling of conditional distributions (predicting densities) by decomposing joint distribution into mixed moments

I am developing a methodology for very accurate modelling of joint distributions by decomposing them in a basis of orthonormal polynomials, where the coefficients have an interpretation similar to (mixed) moments (expected value, variance, skewness, kurtosis, …). It can be used, for example, to model the relations between variables or their time evolution in nonstationary time series.

The likelihood of the predicted conditional distributions grows visibly as information from successive variables is added.

People are used to predicting single values, which can be put into an Excel table, but we can get better predictions by modelling entire (conditional) probability distributions. The first gain is a variance that quantifies the uncertainty of the predicted value (e.g. the expected value).

Using such an orthonormal basis to model the density, we can predict its coefficients (“moments”) independently. The only difference from standard value prediction is that a few moments are predicted separately (each with MSE), here as a simple linear combination for interpretability (a NN could be used instead), and are finally combined into the predicted density.
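As a rough one-dimensional illustration of modelling a density in an orthonormal polynomial basis, the sketch below uses Legendre polynomials rescaled to be orthonormal on [0, 1] and estimates each coefficient as the sample mean of the corresponding basis function. It assumes the variable has already been normalized to [0, 1]; the function names and the Beta-distributed sample are hypothetical, and the conditional (multi-variable) prediction described above is not covered.

```python
import numpy as np
from numpy.polynomial import legendre


def basis(j, x):
    """j-th Legendre polynomial, rescaled to be orthonormal on [0, 1]."""
    coeffs = np.zeros(j + 1)
    coeffs[j] = 1.0
    return np.sqrt(2 * j + 1) * legendre.legval(2 * x - 1, coeffs)


def fit_coefficients(sample, degree):
    """Estimate each coefficient as the sample mean of its basis function."""
    return np.array([basis(j, sample).mean() for j in range(degree + 1)])


def density(x, coeffs):
    """Evaluate the polynomial density estimate sum_j a_j * f_j(x)."""
    return sum(a * basis(j, x) for j, a in enumerate(coeffs))


rng = np.random.default_rng(0)
sample = rng.beta(2.0, 5.0, size=5000)  # hypothetical data, already in [0, 1]

coeffs = fit_coefficients(sample, degree=4)
grid = (np.arange(1000) + 0.5) / 1000   # midpoints of a uniform grid on [0, 1]
estimate = density(grid, coeffs)
print(coeffs)
print(estimate.mean())                  # midpoint-rule integral, close to 1
```

The zeroth coefficient is always 1 (normalization), and the higher ones play the moment-like roles mentioned above: the first shifts mass left or right, the second concentrates or spreads it, and so on. A truncated polynomial estimate can dip below zero in low-density regions, so in practice it may need clipping and renormalizing.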

I have an implementation and am developing it further. What kind of data would you suggest using it on (preferably data with complex, low-dimensional statistical dependencies)? Which ML methods should I compare it with?

Slides, recent paper, its overview:

https://i.imgur.com/2xNPCIm.png

submitted by /u/jarekduda

[D] Statistical Physics and Neural Networks question.

If you look at the theoretical physics literature, there’s a ton of research being done on the statistical physics of neural networks and of deep learning, where analogies with spin glasses and condensed matter models are used to derive all sorts of theoretical results about neural networks.

To be clear, I’m not talking about studies where neural nets are used to model and solve a problem in statistical physics. I’m thinking of the line of research where the mathematics of statistical physics and spin glasses is used as a framework to analyze the behavior of neural nets and arrive at conclusions like “The loss surface of neural nets has this particular topological property” or “CNNs show a phase transition when the number of classes jumps from x to y”, etc.

My question is: did any of these theoretical results from the analysis of neural nets using methods from physics ever lead to practical results, such as a faster training algorithm or improved generalization ability?

As far as I can tell: No, none of the popular NNet models incorporate results from these physics inspired studies. All the improvements come from purely mathematical insights, or originally from biological insights.

But I might be wrong: did any of the significant practical developments in NNets and Deep Learning (better activation functions, training algorithms, regularization methods, …) stem from the statistical physics approaches?

submitted by /u/AlexSnakeKing

[D] Is Neural Magic a scam?

I recently learned about this new startup, which advertises that it can provide GPU-level learning using a CPU. There are already CPU versions of neural network training algorithms. Is Neural Magic engaging in false advertising? What approach are they taking specifically to make the ‘magic’ happen?

submitted by /u/isthataprogenjii

[D] Why softmax+CE over sigmoid+BCE?

Most of the popular neural network language models use softmax + cross-entropy loss during training, which is based on the assumption that only the target label is true and everything else is false. But isn’t language modeling a multilabel classification task? Why isn’t sigmoid+BCE used more often?
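The difference between the two losses can be seen on a single set of logits. With softmax, the vocabulary entries compete and the probabilities sum to 1. With sigmoid, each entry is an independent yes/no decision and the probabilities can sum to anything. The sketch below is a minimal NumPy illustration with a hypothetical four-token vocabulary and made-up logits; it is not taken from any particular language model.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def softmax_ce(logits, target):
    """Cross-entropy for a single true class index under softmax."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[target]


def sigmoid_bce(logits, targets):
    """Sum of per-label binary cross-entropies, in a numerically stable form."""
    per_label = (
        np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    )
    return per_label.sum()


logits = np.array([2.0, 1.0, -1.0, 0.5])  # hypothetical scores for 4 tokens
target = 0
one_hot = np.zeros_like(logits)
one_hot[target] = 1.0

softmax_probs = softmax(logits)
sigmoid_probs = 1.0 / (1.0 + np.exp(-logits))

print(softmax_probs.sum())   # probabilities compete: they sum to 1
print(sigmoid_probs.sum())   # independent decisions: no such constraint
print(softmax_ce(logits, target))
print(sigmoid_bce(logits, one_hot))
```

With a one-hot target, both losses push the target logit up and the rest down. The sigmoid version scores each token in isolation, so it yields no normalized distribution over the vocabulary to sample from; a multilabel target vector (several ones) would be passed to `sigmoid_bce` in the same way.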

submitted by /u/DeMorrr