Category: Reddit MachineLearning

[R] AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers

Written on September 12, 2019. Posted in Reddit MachineLearning.

When watching RL training, it’s often frustrating to see the agent take so long to discover simple things you could code up yourself for parts of the task. This work takes that idea as its basis: if you code up solutions to parts of the problem, how do you incorporate them into RL training? Turns out it’s a little tricky…

Arxiv: https://arxiv.org/abs/1909.04121 (CoRL 2019)

Blog post: http://ai.stanford.edu/blog/acteach/

(hope posting own papers is kosher, open to answering any questions!)

submitted by /u/regalalgorithm

[P] RNNs and Reinforcement learning

Written on September 12, 2019. Posted in Reddit MachineLearning.

Keep in mind that I am still in the early phase of learning ML. I cannot disclose the exact task/problem I am working on (not related to NLP), but the below task captures the essence of it.

(REINFORCEMENT LEARNING)

Input – A paragraph written in English.

Output – For each sentence, predict the level of English on a continuous (not discrete) scale of 1-10. For example, “My English is poor” should score higher than “I have bad English”, even though both are grammatically correct.

Example input: Hey! how are you? It has been so long since I last saw you.

Example output: [5.544554, 5.890909] (made up numbers)

My approach:

1. Encode each word (fixed-length encoding, if that matters).
2. Break the paragraph into sentences, because the prediction for a sentence does not depend on the other sentences.
3. Pad every sentence so that they all have the same length.
4. For every sentence:

   i. Pass each word of the sentence to an RNN encoder and extract the hidden state corresponding to the last word before padding. For example, for the sentence "i am fine padded_word padded_word" with RNN output [A,B,C,D,E], I extract the RNN output/hidden state C. (Not sure if this is the right thing to do.)

   j. Pass this hidden state C to an RNN decoder, which makes the prediction. This prediction leads to a reward.

5. Use PPO (proximal policy optimization).
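The step of extracting the hidden state for the last real word can be illustrated with a small sketch. This is not the poster's code: the scalar RNN, its weights `w` and `u`, the padding value `PAD`, and the numeric encoding of "i am fine" are all made-up assumptions, chosen only to show why state C is a reasonable choice over the final state E.

```python
import math

PAD = 0.0  # assumed padding value; real pipelines usually use a dedicated pad token


def run_rnn(inputs, w=0.5, u=0.8):
    """Toy scalar RNN, h_t = tanh(w * x_t + u * h_{t-1}); returns every h_t."""
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w * x + u * h)
        states.append(h)
    return states


def last_real_state(states, length):
    """Pick the state for the last real token, given the unpadded length."""
    if not 1 <= length <= len(states):
        raise ValueError("length must be between 1 and len(states)")
    return states[length - 1]


# "i am fine" encoded as three made-up numbers, then padded to length 5.
sentence = [1.0, 2.0, 3.0]
padded = sentence + [PAD] * 2

states = run_rnn(padded)                           # plays the role of [A, B, C, D, E]
selected = last_real_state(states, len(sentence))  # plays the role of C

# The selected state matches what the RNN produces on the unpadded sentence,
# while the final state (E) has drifted because the RNN kept updating on padding.
print(selected == run_rnn(sentence)[-1])  # True
print(selected == states[-1])             # False
```

Indexing by the unpadded length keeps the padding from influencing the sentence representation. In PyTorch, `torch.nn.utils.rnn.pack_padded_sequence` is commonly used for a similar purpose, so the RNN never processes the padded steps at all.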

I hope this is clear and am sorry for being so vague about my problem. If it matters, I have a few fully connected layers between encoder and decoder.

So, is this the best approach for this problem?

Does PPO work well with RNNs?

Also, what might be the reason that the network is not learning even though I am using a normalized environment?

Any help would be highly appreciated.

submitted by /u/xicor7017

[D] looking for ML theory researchers

Written on September 12, 2019. Posted in Reddit MachineLearning.

hello, I’m looking for researchers who work on the theoretical side of ML (learning theory, privacy, architecture search and model compression, etc) to speak in remote spotlight sessions. For context, we have been creating a large repository of ML papers turned into in-depth discussion videos (YouTube channel, website), and recently started inviting paper authors to speak about their work remotely. This would be a great contribution to the ML community, but also a good way to get exposure for your work.

If interested, please email us ([events@ai.science](mailto:events@ai.science)), DM here, comment on this, idk make some sort of noise so that I know you’re out there, and let’s talk.

commitment:

prepare a 20-30 min talk about one of your papers
spend 20-30 mins in Q&A with the session moderator and the audience (through live chat read to you by the moderator)
hop on a video call with us, share your screen, bam!

submitted by /u/tdls_to

[D] Does anyone know of an example of model for translating acronyms?

Written on September 12, 2019. Posted in Reddit MachineLearning.

I have a huge corpus of documents that are filled with acronyms. It is mostly government stuff. Currently we use regex to translate them, but the regex performs poorly and requires a lot of manual fixing. I haven’t been able to google this question (it just brings up lists of machine learning/deep learning acronyms).
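For reference, one common rule-based baseline (not a learned model, and not the poster's regex) is to harvest definitions of the form "Long Form (ACR)" from the documents themselves and then expand later mentions. The sketch below assumes acronyms are all-caps and defined in parentheses right after their long form; the function names `find_definitions` and `expand` and the sample sentence are hypothetical.

```python
import re

# Matches an all-caps acronym in parentheses, e.g. "(GAO)".
ACRONYM_IN_PARENS = re.compile(r"\(([A-Z]{2,10})\)")


def find_definitions(text):
    """Map each acronym to the preceding words whose initials spell it."""
    definitions = {}
    for match in ACRONYM_IN_PARENS.finditer(text):
        acronym = match.group(1)
        # Words that appear before the parenthesis.
        words = re.findall(r"[A-Za-z][A-Za-z'-]*", text[:match.start()])
        candidate = words[-len(acronym):]
        initials = "".join(word[0].upper() for word in candidate)
        if initials == acronym:
            definitions[acronym] = " ".join(candidate)
    return definitions


def expand(text, definitions):
    """Replace known acronyms with their long form, skipping '(ACR)' definitions."""
    if not definitions:
        return text
    alternation = "|".join(
        sorted(map(re.escape, definitions), key=len, reverse=True)
    )
    pattern = re.compile(r"(?<!\()\b(" + alternation + r")\b(?!\))")
    return pattern.sub(lambda m: definitions[m.group(1)], text)


text = (
    "The Government Accountability Office (GAO) issued a report. "
    "The GAO found gaps."
)
definitions = find_definitions(text)
print(definitions)
print(expand(text, definitions))
```

This only covers acronyms whose letters line up one-to-one with word initials; mixed-case forms such as "DoD", or definitions that never appear in the corpus, would need a lookup table or a learned disambiguation model on top.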

submitted by /u/Secret_Identity_

[R] Evolution of representations in the Transformer

Written on September 12, 2019. Posted in Reddit MachineLearning.

A paper that explains what representations transformer models actually learn for machine translation, language modelling and BERT-like training.

Arxiv: https://arxiv.org/abs/1909.01380 (EMNLP19, E. Voita et al.)

Blog post: https://lena-voita.github.io/posts/emnlp19_evolution.html

submitted by /u/justheuristic

Developing a Real-Time Gun Detection Classifier

Written on September 12, 2019. Posted in Reddit MachineLearning.

submitted by /u/Iam_nameless

[D] Do people use meta learning in production?

Written on September 12, 2019. Posted in Reddit MachineLearning.

Fine-tuning from pretrained models seems much easier and more straightforward.

submitted by /u/tsauri

[D] Quantum search applicability to machine learning

Written on September 12, 2019. Posted in Reddit MachineLearning.

I was just reading about a hypothesis that quantum search might be common in nature. If it is leveraged in the brain, perhaps that might increase the biological plausibility of Hinton’s capsules. In particular I’m wondering if EM routing between capsules might be implemented as quantum search.

Another possibility is a quantum search encompassing multiple layers of neurons to find weights that minimize the cost function, without the need for back-propagation. But unless nature has some secret for maintaining quantum coherence over large structures, it seems more likely that quantum search would operate more locally.

submitted by /u/trenobus

[P] production with Linux or Docker?

Written on September 12, 2019. Posted in Reddit MachineLearning.

I tried to deploy my Keras model to a server with Flask. I tried to track the memory usage on my local machine.


[R] Learning Single Camera Depth Estimation using Dual-Pixels