Author: torontoai

[D] Debugging model performance discrepancy between offline eval and online exp

Written on December 18, 2019. Posted in Reddit MachineLearning.

I had an interview with an online ads company. The interviewer asked me this question:

“If we expect a newly trained model to perform well in an online experiment, but the experiment result is pretty negative, how do we debug it?”

My answer was: “It may be caused by overfitting. If so, we can change the model, e.g. if we are using a decision tree, we can switch to a random forest.”

The interviewer did not seem very satisfied with the answer; he said switching models is heavyweight. I then answered that it could be a feature discrepancy or a data distribution discrepancy. He then asked how to debug these two cases, and I got a little stuck.

I would like to hear your opinions.
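One concrete way to start on the data-distribution case is to compare each feature's distribution in the offline training set against the values logged at serving time. Below is a minimal Python sketch using the population stability index (PSI). The function name `psi`, the bin count, and the synthetic data are assumptions for illustration only; nothing here comes from the interview itself.

```python
import bisect
import math
import random


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population stability index of one numeric feature across two samples."""
    # Bin edges are quantiles of the offline (expected) sample.
    ordered = sorted(expected)
    edges = [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # bisect_right gives the number of edges <= v, i.e. the bin index.
            counts[bisect.bisect_right(edges, v)] += 1
        # Floor at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    p = bin_fractions(expected)
    q = bin_fractions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Synthetic stand-ins for "feature values at training time" and
# "feature values logged online" (hypothetical data).
random.seed(0)
offline = [random.gauss(0.0, 1.0) for _ in range(5000)]
online_same = [random.gauss(0.0, 1.0) for _ in range(5000)]
online_shifted = [random.gauss(1.0, 1.0) for _ in range(5000)]

print("PSI, no shift:  ", psi(offline, online_same))
print("PSI, mean shift:", psi(offline, online_shifted))
```

Running this per feature and sorting by PSI gives a shortlist of features to inspect. A feature that is computed differently online than offline would typically show up the same way, as a large PSI on that one feature.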

submitted by /u/marksteve4

PointRend: Image Segmentation as Rendering

Written on December 18, 2019. Posted in Reddit MachineLearning.

submitted by /u/AlleUndKalle

[Discussion] GloVe Paper Question

Written on December 18, 2019. Posted in Reddit MachineLearning.

The GloVe paper states the following:

Our final model should be invariant under this relabeling, but Eqn. (3) is not.

Why is Equation 3 not invariant under the relabeling?

Which relabeling are they referring to?

Thanks!
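For context, here is the step in question, restated from the GloVe paper's derivation. The notation follows the paper as best it can be reproduced here, so please check it against the original before relying on it.

```latex
% Eqn. (3) of the GloVe paper, with P_{ik} = X_{ik} / X_i:
F\big((w_i - w_j)^{\top} \tilde{w}_k\big) = \frac{P_{ik}}{P_{jk}}

% The relabeling is the exchange of the word and context-word roles:
w \leftrightarrow \tilde{w}, \qquad X \leftrightarrow X^{\top}

% Eqn. (3) treats the two roles differently: a difference of word vectors
% on the left, and row-normalized counts P_{ik} = X_{ik} / X_i on the right.
% After the swap it becomes a different equation, with context vectors
% differenced and counts normalized by the rows of X^{\top}.

% The paper restores the symmetry by requiring F to be a homomorphism,
% which leads to F = \exp and
w_i^{\top} \tilde{w}_k = \log X_{ik} - \log X_i

% and then by absorbing \log X_i into a bias b_i and adding \tilde{b}_k:
w_i^{\top} \tilde{w}_k + b_i + \tilde{b}_k = \log X_{ik}
```

The last line is unchanged when w and w-tilde (with their biases) are swapped and X is transposed, which is the invariance the quoted sentence asks for.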

submitted by /u/Lightbringer_Book

[R] Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data

Written on December 18, 2019. Posted in Reddit MachineLearning.

submitted by /u/hardmaru

[D] Q about “Conv Seq 2 Seq learning” paper

Written on December 18, 2019. Posted in Reddit MachineLearning.

I’m reading this https://arxiv.org/pdf/1705.03122.pdf

On page 3, 2nd paragraph, I don’t understand what W and b_w are. I get that the input is a (k x d) matrix, but how is the convolution performed, and why does the output Y have size (1 x 2d)?
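One way to read that paragraph is this. The k input vectors are concatenated into a single vector of length k*d. W is a (2d x kd) matrix and b_w a bias of length 2d, so Y = W x + b_w has 2d entries. The gated linear unit that follows splits Y into two halves A and B and returns A * sigmoid(B), which brings the size back to d. The pure-Python sketch below uses toy sizes and random weights; the names `conv_step` and `glu` are made up for illustration and are not from the paper's code.

```python
import math
import random


def conv_step(window, W, b_w):
    """One kernel position: window is k vectors of size d, output has size 2d."""
    # Concatenate the k input elements: (k x d) -> (k*d,)
    x = [value for row in window for value in row]
    # Y = W x + b_w, with W of shape (2d x k*d) and b_w of length 2d
    return [
        sum(w * xj for w, xj in zip(w_row, x)) + bias
        for w_row, bias in zip(W, b_w)
    ]


def glu(y):
    """Gated linear unit: split Y = [A B] into halves, return A * sigmoid(B)."""
    half = len(y) // 2
    a, b = y[:half], y[half:]
    return [ai / (1.0 + math.exp(-bi)) for ai, bi in zip(a, b)]


k, d = 3, 4  # toy kernel width and embedding size
random.seed(0)
window = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(k)]
W = [[random.uniform(-1, 1) for _ in range(k * d)] for _ in range(2 * d)]
b_w = [0.0] * (2 * d)

y = conv_step(window, W, b_w)  # length 2d
h = glu(y)                     # length d
print(len(y), len(h))          # prints: 8 4
```

Sliding the same W and b_w over every window of k consecutive positions gives the full convolution. The doubling to 2d exists only so the gate has its own half of the output to work with.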

submitted by /u/ME_PhD

[Discussion] How to estimate conditional probability (cdf) of multivariate dataset?

Written on December 18, 2019. Posted in Reddit MachineLearning.

Hi,

I am sharing a problem I face in MATLAB, but if you have a solution for it, even in Python, I would be very happy.

I was able to estimate the conditional probability (CDF) for a dataset that has two features (X_1 and Y), i.e., P(X_1|Y), using a MATLAB function called “quantilePredict”. It works great. However, when I consider three features X_1, X_2 and Y, how can I find P(X_1,X_2|Y) without the assumption of conditional independence?

How can I capture the covariance as well as the CDF while working with quantiles rather than the mean of the data? In the worst case, I am fine with capturing the covariance as well as the CDF using the mean of the data.

The TreeBagger model (f) is trained with “Y” as input and X_1 as output, i.e., X_1 = f(Y). We then pass the TreeBagger model to “quantilePredict” to predict responses. In the multivariate case, however, TreeBagger cannot fit the data when the input is “Y” and the output is “X_1, X_2”, i.e., (X_1, X_2) = f(Y). (This idea/point of view is probably wrong and naive.)
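One Python option that does not assume conditional independence is a kernel-weighted empirical CDF. Each sample is weighted by how close its Y is to the query value, and the weights of samples with X_1 <= a and X_2 <= b are summed. This is a different technique from quantilePredict, offered only as a sketch. The function name, the Gaussian kernel, the bandwidth, and the synthetic data (built so that X_1 and X_2 stay correlated given Y) are all assumptions.

```python
import math
import random


def conditional_joint_cdf(samples, y0, a, b, bandwidth=0.2):
    """Estimate P(X1 <= a, X2 <= b | Y = y0) from (x1, x2, y) triples."""
    num = 0.0
    den = 0.0
    for x1, x2, y in samples:
        # Gaussian kernel weight: samples with y near y0 count the most.
        w = math.exp(-0.5 * ((y - y0) / bandwidth) ** 2)
        den += w
        if x1 <= a and x2 <= b:
            num += w
    return num / den if den > 0 else float("nan")


# Synthetic data: X1 = Y + e1 and X2 = X1 + e2, so X1 and X2 remain
# dependent even after conditioning on Y (hypothetical example).
random.seed(0)
samples = []
for _ in range(20000):
    y = random.uniform(-2.0, 2.0)
    x1 = y + random.gauss(0.0, 1.0)
    x2 = x1 + random.gauss(0.0, 1.0)
    samples.append((x1, x2, y))

joint = conditional_joint_cdf(samples, 0.0, 0.0, 0.0)
marg1 = conditional_joint_cdf(samples, 0.0, 0.0, math.inf)
marg2 = conditional_joint_cdf(samples, 0.0, math.inf, 0.0)
product = marg1 * marg2

print("joint estimate:       ", joint)
print("product of marginals: ", product)
```

The gap between the joint estimate and the product of the two marginals is exactly what a conditional-independence assumption would throw away. The bandwidth controls the usual bias/variance trade-off and would need tuning on real data.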

submitted by /u/askquestion001

[P] Simple and effective phrase finding in multi-language?

Written on December 18, 2019. Posted in Reddit MachineLearning.

Dealing with out-of-vocabulary words or phrases has long been a problem in NLP, and sometimes deep learning costs too much.

Maybe we can first try a simple statistical approach: finding potential phrases based on word boundaries.

How?

There is a drop in frequency at the boundary of a phrase in a sentence. For example, take one of the sentences in “Attention Is All You Need”:

…multi-head attention in three different ways…

multi-head: frequency 10
multi-head attention: frequency 8
multi-head attention in: frequency 1 <- drop!!
multi-head attention in three: frequency 1

Capturing this drop gives us some potential phrases, so I created a library to help with this.
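The frequency-drop idea can be sketched independently of the library. The following is a minimal illustration and not Phraseg's actual implementation. The function names, the toy sentences, and the thresholds `min_count` and `drop_ratio` are hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, max_n):
    """Count every n-gram of length 1..max_n in a token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def candidate_phrases(tokens, max_n=4, min_count=2, drop_ratio=0.5):
    """Keep frequent n-grams whose one-word extensions are much rarer."""
    counts = ngram_counts(tokens, max_n + 1)

    # For each n-gram, record the count of its most frequent extension
    # by one word to the right.
    best_extension = {}
    for gram, c in counts.items():
        if len(gram) >= 2:
            prefix = gram[:-1]
            best_extension[prefix] = max(best_extension.get(prefix, 0), c)

    phrases = {}
    for gram, c in counts.items():
        if 2 <= len(gram) <= max_n and c >= min_count:
            # A sharp drop after the last word marks a phrase boundary.
            if best_extension.get(gram, 0) <= c * drop_ratio:
                phrases[" ".join(gram)] = c
    return phrases


tokens = (
    "multi-head attention in three ways . "
    "we use multi-head attention for encoding . "
    "multi-head attention helps ."
).split()

print(candidate_phrases(tokens))  # {'multi-head attention': 3}
```

In this toy input, “multi-head attention” occurs three times while each of its extensions occurs once, so it is the only candidate kept. A real implementation also needs tokenization that works across languages, which this sketch leaves out.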

GitHub project – Phraseg

phraseg = Phraseg(''' The goal of reducing sequential computation also forms the foundation of the Extended Neural GPU [16], ByteNet [18] and ConvS2S [9], all of which use convolutional neural networks as basic building block, computing hidden representations in parallel for all input and output positions. In these models, the number of operations required to relate signals from two arbitrary input or output positions grows in the distance between positions, linearly for ConvS2S and logarithmically for ByteNet. This makes it more difficult to learn dependencies between distant positions [12]. In the Transformer this is reduced to a constant number of operations, albeit at the cost of reduced effective resolution due to averaging attention-weighted positions, an effect we counteract with Multi-Head Attention as described in section 3.2. Self-attention, sometimes called intra-attention is an attention mechanism relating different positions of a single sequence in order to compute a representation of the sequence. Self-attention has been used successfully in a variety of tasks including reading comprehension, abstractive summarization, textual entailment and learning task-independent sentence representations [4, 27, 28, 22]. End-to-end memory networks are based on a recurrent attention mechanism instead of sequence- aligned recurrence and have been shown to perform well on simple-language question answering and language modeling tasks [34]. To the best of our knowledge, however, the Transformer is the first transduction model relying entirely on self-attention to compute representations of its input and output without using sequence- aligned RNNs or convolution. In the following sections, we will describe the Transformer, motivate self-attention and discuss its advantages over models such as [17, 18] and [9]. ''')
result = phraseg.extract()

The result will be:

[('the Transformer', 3), ('of the', 2), ('ConvS 2 S', 2), ('input and output', 2), ('output positions', 2), ('number of operations', 2), ('In the', 2), ('attention mechanism', 2), ('to compute', 2)]

Application

We can use this to explore daily trending GitHub repos:

https://colab.research.google.com/drive/133uFefx7nMgeuah4FfHZjpqmqfxTyKui

Details on how it works:

https://medium.com/@voidful.stack/simple-and-effective-phrase-finding-in-multi-language-42264554acb

GitHub project:

https://github.com/voidful/Phraseg

submitted by /u/voidful-stack

Intermediate vue.js frontend developer – AJ E-Commerce Technology Ltd. – Markham, ON

Written on December 18, 2019. Posted in Toronto Job Postings.

The end-to-end platform provides deep customer insights that drive real-time shopping experiences across any touch point and any device through its patented…
$60,000 – $80,000 a year
From Indeed – Thu, 19 Dec 2019 00:21:21 GMT – View all Markham, ON jobs

[R] Neural networks grown and self-organized by noise (NeurIPS2019)

Written on December 17, 2019. Posted in Reddit MachineLearning.

submitted by /u/hardmaru

API Microservice Architect – Next Pathway – Toronto, ON

Written on December 17, 2019. Posted in Toronto Job Postings.

With deep exposure to AI, Machine Learning and Robotic Process Automation, our team members have opportunities to be trailblazers in the technology space.
From Indeed – Wed, 18 Dec 2019 23:23:04 GMT – View all Toronto, ON jobs
