
Category: Reddit MachineLearning

[D] Is it possible to hack AutoEncoder?

Re-Ha! (which means Reddit Hi!)

As I wrote in the title, is it possible to hack an AutoEncoder?

What the heck does 'hacking an AutoEncoder' mean, then?

Let me give you a simple scenario.

[Scenario]

Suppose Jane extracts a latent representation, L, of her private data, X, which has three features (body weight, height, and a binary variable indicating whether she ate pizza or not), daily.

X -> Encoder -> L -> Decoder -> X’ (reconstructed input ~ X)

(X: 3 dim., L: 1 dim., X’: 3 dim.)

She built a simple ML system that tracks the three features every day, retrains the AutoEncoder, and uploads it to her private server.

Then suppose Chris (friend-zoned by Jane a month ago) succeeds in stealing L by installing a backdoor program on Jane's server.

But he doesn't know the decoder's architecture, its trained weights, or the reconstructed input X'.

All he has is the latent representation L (continually updated) and the dimensionality of the original input X.
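
For concreteness, here is a minimal sketch of the 3 → 1 → 3 autoencoder in the scenario (assuming PyTorch; the hidden sizes and activations are illustrative, not Jane's actual model):

```python
import torch
import torch.nn as nn

class TinyAutoEncoder(nn.Module):
    """3-D input -> 1-D latent -> 3-D reconstruction, as in the scenario."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))
        self.decoder = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 3))

    def forward(self, x):
        latent = self.encoder(x)             # L: what Chris manages to steal
        return self.decoder(latent), latent  # X': what Chris never sees

# Jane's daily record: (weight, height, ate_pizza)
model = TinyAutoEncoder()
x = torch.tensor([[68.0, 1.70, 1.0]])
x_hat, latent = model(x)
loss = nn.functional.mse_loss(x_hat, x)      # reconstruction objective
```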

[Question]

In this situation, is it possible for Chris to retrieve the original input X?

I think it is of course impossible, but then what mathematical concept would support that impossibility?

Or, is there any possible method to reconstruct/approximate the original input?

Thank you in advance!

submitted by /u/vaseline555

[D] Clustering methodology for high dimensional data, where some features have strong correlations to one another?

Hi, I’m working on a model to cluster users based on their demographic and behavioral features.

I was reading up on some literature on the topic and found that strongly correlated features can skew the dimensionality reduction (right now, via PCA) toward the features that are highly correlated with one another.

I was thinking of running a simple correlation matrix to remove those features and cut through the clutter before clustering.

But right now, our methodology looks like this:

1. Normalize the features (mean 0, stdev 1)
2. Correlation matrix to weed out some features
3. PCA or some other dimensionality reduction
4. K-means clustering
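
A minimal sketch of that pipeline, assuming scikit-learn (the correlation threshold, number of components, and number of clusters are placeholders):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def cluster_users(df: pd.DataFrame, corr_threshold: float = 0.9,
                  n_components: int = 10, n_clusters: int = 5) -> np.ndarray:
    # 1. Normalize the features (mean 0, stdev 1)
    X = StandardScaler().fit_transform(df.values)

    # 2. Correlation matrix: drop one feature from each highly correlated pair
    corr = np.abs(np.corrcoef(X, rowvar=False))
    upper = np.triu(corr, k=1)
    keep = [i for i in range(X.shape[1]) if not (upper[:, i] > corr_threshold).any()]
    X = X[:, keep]

    # 3. Dimensionality reduction
    X = PCA(n_components=min(n_components, X.shape[1])).fit_transform(X)

    # 4. K-means clustering
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
```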

The problem is that there are some features we might not be able to cut: category mixes (e.g. a user has spent x% on category A, y% on category B, z% on category C, where x+y+z = 100%) are still relevant in our case but will be highly correlated with one another. Any ideas on how we can handle this?

And as an aside, how do clustering algorithms (K-means specifically) handle null values?

Would love to hear your take on the methodology! All help appreciated, thanks!

submitted by /u/ibetDELWYN

[D] Learning one representation for multiple tasks – favoring some tasks over others?

Are there any papers on balancing the impact of multiple tasks on the final single representation?

Let there be n tasks to be solved by one representation:

L_total = L_t1 + L_t2 + … + L_tn

If we want to favor one task over another based on prior knowledge, is there any way other than setting a lambda-type hyperparameter to scale the loss for a specific task?
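
A minimal sketch of that lambda-weighted total loss (assuming PyTorch; the per-task losses and weights are dummy placeholders):

```python
import torch

def total_loss(task_losses, weights=None):
    # L_total = sum_i lambda_i * L_ti; with all lambda_i = 1 this is the plain sum
    if weights is None:
        weights = [1.0] * len(task_losses)
    return sum(w * l for w, l in zip(weights, task_losses))

# Dummy per-task losses computed on the shared representation
loss_t1 = torch.tensor(0.8)
loss_t2 = torch.tensor(0.3)
loss_t3 = torch.tensor(1.2)

# Favor task 1 over the others based on prior knowledge
loss = total_loss([loss_t1, loss_t2, loss_t3], weights=[2.0, 1.0, 0.5])
```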

submitted by /u/searchingundergrad

[P] Object detection using faster R-CNN

I am working on a problem where I have to identify small objects in high-resolution images, and I was wondering how to approach it.

Basically, I have at most a hundred high-resolution images containing tiny objects I need to detect, and I was wondering whether I should turn this into a supervised classification problem, where I would first label some images and then try some classification approaches, or whether I should apply detection algorithms such as Faster R-CNN.

As I cannot elaborate much more on the topic due to privacy concerns, I would like to know which approach would be best, or how I can assess which approach to take.
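
For reference, a minimal sketch of fine-tuning a pretrained Faster R-CNN on a small labeled set (assuming a recent torchvision; the number of classes and the dummy image/target are placeholders):

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Pretrained Faster R-CNN with a ResNet-50 FPN backbone
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box-predictor head for a custom number of classes (background + 1 object type)
num_classes = 2
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# In training mode the model takes a list of image tensors and a list of target dicts
model.train()
images = [torch.rand(3, 800, 800)]
targets = [{"boxes": torch.tensor([[100.0, 120.0, 160.0, 180.0]]),
            "labels": torch.tensor([1])}]
loss_dict = model(images, targets)  # classification and box-regression losses
```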

submitted by /u/xOrbitz

[P] Training Random Forest with a single vector (for each obs) in h2o?

I'm starting to use h2o to train and serve models. I have a dataset that I'd already curated for Spark ML pipelines, and I pass a single 16-D vector as the training data for each observation.

A friend said that h2o requires a separate column for each category and treats my single vector as a string, but I can't find anything to support that claim. The accuracy is around what I got out of Spark ML, but I'm worried about how h2o is handling my data. Does anyone know how h2o handles this case?

tl;dr – Can I use a single vector for each training observation in h2o?
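
One way to remove the ambiguity is to expand the 16-D vector into 16 numeric columns before building the H2OFrame; a minimal sketch assuming the h2o Python client and pandas (data and column names are toy placeholders):

```python
import h2o
import numpy as np
import pandas as pd
from h2o.estimators import H2ORandomForestEstimator

h2o.init()

# Toy data: 100 observations, each a 16-D feature vector plus a binary label
vectors = np.random.rand(100, 16)
labels = np.random.randint(0, 2, size=100)

# Expand the vector into one numeric column per dimension so h2o sees 16 features,
# not a single string-like column
df = pd.DataFrame(vectors, columns=[f"f{i}" for i in range(16)])
df["label"] = labels

frame = h2o.H2OFrame(df)
frame["label"] = frame["label"].asfactor()  # treat the target as categorical

model = H2ORandomForestEstimator(ntrees=50)
model.train(x=[f"f{i}" for i in range(16)], y="label", training_frame=frame)
```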

submitted by /u/Octosaurus

[R] SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition


Project Page: https://sites.google.com/view/space-project-page

Paper: https://openreview.net/pdf?id=rkl03ySYDH

Abstract: The ability to decompose complex multi-object scenes into meaningful abstractions like objects is fundamental to achieve higher-level cognition. Previous approaches for unsupervised object-oriented scene representation learning are either based on spatial-attention or scene-mixture approaches and limited in scalability which is a main obstacle towards modeling real-world scenes. In this paper, we propose a generative latent variable model, called SPACE, that provides a unified probabilistic modeling framework that combines the best of spatial-attention and scene-mixture approaches. SPACE can explicitly provide factorized object representations for foreground objects while also decomposing background segments of complex morphology. Previous models are good at either of these, but not both. SPACE also resolves the scalability problems of previous methods by incorporating parallel spatial-attention and thus is applicable to scenes with a large number of objects without performance degradations. We show through experiments on Atari and 3D-Rooms that SPACE achieves the above properties consistently in comparison to SPAIR, IODINE, and GENESIS.

Examples:

https://i.redd.it/xesd57isql941.gif

https://preview.redd.it/rguv24ibrl941.png?width=1280&format=png&auto=webp&s=470daf1c544df1d403a885d3db4a4415c3abb50e

https://preview.redd.it/gxa2jukosl941.png?width=1437&format=png&auto=webp&s=9ae306b2e0f397da4cb2d5c938a21c5ff8d390c6

submitted by /u/yifuwu

Visualizing Effect of Deep Double Descent on Model “Lottery Ticket” Architecture? [D]

Has anyone done any work on visualizing how the internal “lottery ticket” structure of a neural network changes as it goes through deep-double-descent?

Background:
One popular theory for explaining Deep Double Descent is that double descent occurs as a model truly learns to generalize by finding the “Occam’s Razor” model — the idea that the simplest model to fit the data is the best model for generalizing a solution. This is closely associated with Lottery Ticket Hypothesis and model compression, where you can cull a model’s under-used weights to arrive at a smaller model that provides almost identical accuracy. Lottery Ticket Hypothesis says (roughly paraphrased) that there is a “model within the model” that is the most significant portion of a deep neural network, and once you find that “winning ticket”, then the other nodes in the network aren’t that important.
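
As a rough illustration of the weight-culling idea above, here is a minimal magnitude-pruning sketch (assuming PyTorch; this only shows the core idea of keeping the largest-magnitude weights, not the iterative rewinding procedure from the Lottery Ticket paper):

```python
import torch
import torch.nn as nn

def winning_ticket_masks(model: nn.Module, keep_fraction: float = 0.2):
    """Binary masks keeping only the largest-magnitude weights in each weight matrix."""
    masks = {}
    for name, param in model.named_parameters():
        if "weight" in name:
            k = max(1, int(keep_fraction * param.numel()))
            # k-th largest magnitude = (numel - k + 1)-th smallest
            threshold = param.abs().flatten().kthvalue(param.numel() - k + 1).values
            masks[name] = (param.abs() >= threshold).float()
    return masks

model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 2))
masks = winning_ticket_masks(model, keep_fraction=0.2)

# Zero out the culled weights; the surviving sparse pattern is the candidate "ticket"
with torch.no_grad():
    for name, param in model.named_parameters():
        if name in masks:
            param.mul_(masks[name])
```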

What I’m wondering is — has there been any work done on visualizing the network architecture of most-significant weights as a model goes through the stages of Deep Double-Descent — from first trough, to plateau, to second descent?

I’m curious to know how much the core “internal architecture” changes in each of those stages, and if we can actually visualize the architecture narrowing in on that “Occam’s Lottery Ticket”…?

submitted by /u/CHerronAptera

[D] BERT Large Fine-tune Benchmarks with NVIDIA Quadro RTX 6000 & RTX 8000 GPUs

Hey ML community,

We recently ran a series of benchmark tests showing the capabilities of NVIDIA Quadro RTX 6000 and RTX 8000 GPUs on BERT Large with different batch sizes, sequence lengths, and FP32 and FP16 precision. These were ran using the NVIDIA benchmark script found on their github, and show 1, 2, and 4 GPU configs in a workstation.

RTX 6000 https://blog.exxactcorp.com/nvidia-quadro-rtx-6000-bert-large-fine-tune-benchmarks-with-squad-dataset/

RTX 8000 https://blog.exxactcorp.com/nvidia-quadro-rtx-8000-bert-large-fine-tuning-benchmarks-in-tensorflow/
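
For readers unfamiliar with the FP32/FP16 distinction these benchmarks compare, here is how mixed precision is typically enabled in TensorFlow 2 Keras (illustrative only; the benchmarks themselves use NVIDIA's own scripts, and the toy model below is not BERT):

```python
import tensorflow as tf
from tensorflow.keras import mixed_precision

# Run most ops in float16 on the GPU while keeping float32 master weights
mixed_precision.set_global_policy("mixed_float16")

# Toy model; the final layer is kept in float32 for numerical stability
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1024, activation="relu", input_shape=(768,)),
    tf.keras.layers.Dense(2, dtype="float32"),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```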

What types of tests/benchmarks would you like to see run on these GPUs? What are your thoughts?

Cheers,

JM

submitted by /u/exxact-jm