Author: torontoai

Thank you for your feedback. I did not try this time.

Written on September 23, 2019. Posted in Susan Li.

Thank you for your feedback. I did not try this time.

[D] Methods to perform unsupervised similarity scoring

Written on September 23, 2019. Posted in Reddit MachineLearning.

I have a task and I don’t know how to tackle this. I received a set of positives and I have to find similar points from a big dataset (that I call basket). I have around 1’000 positives and around 1’000’000 points in the basket. All points are represented with 10 to 15 features. As an output, I would like to have a score for each point of the basket and this score would represent the closeness of the point to the positive set.

I first thought of using a k-nearest neighbours method on the positives but this approach presents two big drawbacks for me. First, I wouldn’t have a score associated to each point of the basket as I would only have a set of close points for each positive. Secondly, and this is the biggest drawback in my opinion, I would have to define the distance in the n-dimensional space myself while I would prefer that the method directly defines weights for each feature on the data (for instance, based on the level of information (variance) contained in each feature).

Does someone could point out to me a good approach to tackle this problem?

Thanks!

submitted by /u/SupervisedHelloWorld
[link] [comments]

rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch

Written on September 23, 2019. Posted in Uncategorized.

Since the advent of deep reinforcement learning for game play in 2013, and
simulated robotic control shortly after, a multitude of new algorithms
have flourished. Most of these are model-free algorithms which can be
categorized into three families: deep Q-learning, policy gradients, and Q-value
policy gradients. Because they rely on different learning paradigms, and
because they address different (but overlapping) control problems,
distinguished by discrete versus continuous action sets, these three families
have developed along separate lines of research. Currently, very few if any
code bases incorporate all three kinds of algorithms, and many of the original
implementations remain unreleased. As a result, practitioners often must
develop from different starting points and potentially learn a new code base
for each algorithm of interest or baseline comparison. RL researchers must
invest time reimplementing algorithms–a valuable individual exercise but one
which incurs redundant effort across the community, or worse, one that presents
a barrier to entry.

Yet these algorithms share a great depth of common reinforcement learning
machinery. We are pleased to share rlpyt, which leverages this commonality
to offer all three algorithm families built on a shared, optimized
infrastructure, in one repository. Available from BAIR at
https://github.com/astooke/rlpyt, it contains modular implementations of
many common deep RL algorithms in Python using Pytorch, a leading deep learning
library. Among numerous existing implementations, rlpyt is a more
comprehensive open-source resource for researchers.

[D] State of Supervised ML

Written on September 23, 2019. Posted in Reddit MachineLearning.

Hi,

I’ve been recently away from the research landscape, so I’m not up to date on what’s going on lately.

I know in 2017 the big hype was still Supervised Learning with bigger and bigger models.

Is there anything that might predict upcoming breakthrough in unsupervised ML or simpler supervised models (few-shot/one-shot learning) ?

What do you thing ML will look like in 5-10 years ?

submitted by /u/swentso
[link] [comments]

[P] Natural Language Processing Roadmap and Keyword for students who are wondering what to study

Written on September 23, 2019. Posted in Reddit MachineLearning.

Hello.

I created summarized Natural Language Processing Roadmap in Github Repository with preparing NLP Engineer Interview to not forgetting which i had learned things. 😀 😀

It’s contain in order Probability and Statistics, Machine Learning, Text Mining, Natural Language Processing.

It was very hard to make tree, sub-tree sctucture of mind map with abstract keywords, so Please focus on KEYWORD in square box, as things to study.

Also You can use the material commercially or freely, but please leave the source.

If you like the project, please ask star, fork and Contribution! 😀 Thanks!!

https://i.redd.it/qradrhttnho31.png

https://i.redd.it/9zdjvaavnho31.png

https://i.redd.it/ah8w7x8wnho31.png

https://i.redd.it/wv0sw8bxnho31.png

https://github.com/graykode/nlp-roadmap

submitted by /u/nlkey2022
[link] [comments]

[R] MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection. A 100gb audio collection containing sounds of malfunctioning mechanical equipment, recorded with an 8-channel microphone array.

Written on September 23, 2019. Posted in Reddit MachineLearning.

submitted by /u/chisai_mikan
[link] [comments]

[P] Does this exist? An AI that reads a bibliography then recommends papers based on those present in the bibliography

Written on September 23, 2019. Posted in Reddit MachineLearning.

I’m working on a paper and am currently conducting a literature review. I’m building up a bibliography file and I would like if there were an AI that could read the bibliography then recommend papers to add to that bibliography based on those it saw. I could easily envision websites such as Web of Science having such a system (they do not) so I’m just wondering if there is such a site.

submitted by /u/NTGuardian
[link] [comments]

[D] RunwayML (good/great/bad/ugly)

Written on September 23, 2019. Posted in Reddit MachineLearning.

I’m involved in facial recognition and was wondering if anyone has tried runwayML?

Any thoughts? what’s your experience been like? would love some honest review here (good/great/bad/ugly)

What tasks/application do you use runwayml for? would love to hear from all

submitted by /u/MLtinkerer
[link] [comments]

[D] Transformer number of token performance limits

Written on September 22, 2019. Posted in Reddit MachineLearning.

Hi,

I am currently working on a research project that involves using a transformer-like model for a NLP task. Specifically summarizing long documents.

I was wondering if any of you knows of a paper that explores the limits of the transformer when using super long sequences.

Is there any issue with long sequences?

Is there a “length” limit that this kind of model starts to decrease their performance?

Thanks a lot in advance!

submitted by /u/fdelrio89
[link] [comments]

[P] CEFR Checker – an AI tool for adapting texts to language learners’ CEFR levels

Written on September 22, 2019. Posted in Reddit MachineLearning.

Hi all, I’m a machine learning engineer at Duolingo, and I wanted to share a project I’ve been working on called CEFR Checker. It aligns text from many languages to a common language learning proficiency standard (the CEFR) using an ordinal regression model trained over multilingual word embeddings and corpus word frequency estimates. We use this tool when we create Duolingo Podcasts and interactive Stories to help adapt the material to the appropriate CEFR level for beginner, intermediate or advanced learners.

We’re sharing it publicly in case others find it useful as well.

The tool is completely free to use and you can check it out here.

Here’s a blog post I wrote about the project if you’d like more info.

submitted by /u/it-is-a-fork
[link] [comments]

Blog

Learn About Our Meetup

5000+ Members

MEETUPS

JOB POSTINGS

CONTACT

Author: torontoai

Thank you for your feedback. I did not try this time.

[D] Methods to perform unsupervised similarity scoring

rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch

[D] State of Supervised ML

[P] Natural Language Processing Roadmap and Keyword for students who are wondering what to study

[R] MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection. A 100gb audio collection containing sounds of malfunctioning mechanical equipment, recorded with an 8-channel microphone array.

[P] Does this exist? An AI that reads a bibliography then recommends papers based on those present in the bibliography

[D] RunwayML (good/great/bad/ugly)

[D] Transformer number of token performance limits

[P] CEFR Checker – an AI tool for adapting texts to language learners’ CEFR levels