Category: Reddit MachineLearning

[P] Sentence similarity using siamese LSTM

Written on November 17, 2019. Posted in Reddit MachineLearning.

So I have a project where I compute the semantic similarity between pairs of sentences. For the dataset, I use STS-Benchmark.

First, I used an English Wikipedia dump to create a word2vec matrix. Then I used the function text_to_sequence to convert my sentences to an array.

I developed a siamese LSTM, but my problem is that the validation accuracy never increases. I’m stuck at an accuracy of 0.25 to 0.30. When I use Spearman’s correlation, I get a value of 45%.
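For reference, STS-Benchmark labels are continuous similarity scores from 0 to 5, so rank correlation is usually a more natural metric than classification accuracy. The following is a minimal pure-Python sketch of Spearman's correlation, not the code from the post; the `gold` and `pred` values are made-up illustrations.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gold similarity scores (0-5) and model outputs (any scale).
gold = [0.0, 1.5, 2.0, 3.5, 5.0]
pred = [0.1, 0.3, 0.2, 0.7, 0.9]
rho = spearman(gold, pred)
print(f"Spearman rho = {rho:.2f}")
```

Because only the ordering matters, the predictions do not need to be on the same 0-5 scale as the labels.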

Here is my code: https://pastebin.com/jPaZbDDM

submitted by /u/momo11arsenal

[D] So where are we at on the Hype Cycle right now? Before or after the trough of disillusionment?

Written on November 17, 2019. Posted in Reddit MachineLearning.

You never hear about ML or “AI” in the mainstream any more, not since about 2017. AlphaGo and self-driving were exciting, then self-driving started to struggle and people twigged that the self-play approach cannot be translated to applications where you don’t have a ‘game’, i.e. a perfect model of the environment’s response to actions. The only things that make the media now are garbage from OpenAI like “We PuBliSheD a BoOk written by our garbled-text generator”. So I just wonder what people think: are we on the way down to a crash, or just past it and quietly plateauing in productivity, this time around the “AI” hype cycle?

submitted by /u/carrolldunham

[D] Best resources to learn about Anomaly Detection on Big Datasets?

Written on November 16, 2019. Posted in Reddit MachineLearning.

What are the best books, university courses or MOOCs to learn how to detect outliers? Preferably methods that are applicable to the Big Data ecosystem.

Thank you in advance.

submitted by /u/spq

[N] Microsoft Incorporates Graphcore AI Chips in Azure Cloud

Written on November 16, 2019. Posted in Reddit MachineLearning.

Graphcore’s AI accelerator chip, the Colossus intelligence processing unit (IPU), is now available for customers to use as part of Microsoft’s Azure cloud platform.

This is the first time any major cloud service provider has publicly offered customers the opportunity to run their data on an accelerator from any of the dozens of AI chip startups and as such, it represents a big win for Graphcore. Microsoft has said access will initially be prioritised for customers who are “pushing the boundaries of machine learning”.

Microsoft and Graphcore have been working together for two years to develop cloud systems and build enhanced vision and natural language processing models for the Graphcore IPU. A particular focus has been Google’s natural language processing (NLP) model BERT (bidirectional encoder representations from transformers), which is currently very popular with search engines, including Google’s own.

Using eight Graphcore IPU processor cards (each with a pair of Colossus accelerators), BERT can be trained in 56 hours, similar to the result for GPU with PyTorch, though it is faster than the GPU with TensorFlow (see graph below). Graphcore says customers are seeing BERT inference throughput increase threefold, with 20% improvement in latency.

Given the level of hype surrounding Graphcore — the company is valued at $1.7 billion — these performance improvements seem rather modest. It remains to be seen whether the promised improvement is enough to tempt customers into optimising their models for the IPU.

Advanced models
At the same time, Graphcore has also released some results on more advanced models, where it showed more dramatic performance improvements.

Inference on image processing model ResNext was accelerated 3.4x in terms of throughput at 18x lower latency, compared to a GPU solution consuming the same amount of power. ResNext uses a technique called group separable convolutions, which splits convolution filters into smaller separable blocks to increase accuracy while reducing the parameter count. This approach is well-suited to the IPU, Graphcore says, because of the chip’s massively parallel processor architecture and more flexible, high-throughput memory; smaller blocks of data can be mapped to thousands of fully independent processing threads.
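The parameter saving from grouping can be checked with a little arithmetic. This is a minimal sketch, assuming the article's "group separable convolutions" refers to the grouped convolutions used in ResNeXt; the layer sizes (128 channels, 3x3 kernel, 32 groups) are illustrative and not taken from the article.

```python
def conv_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Weights in a 2D convolution layer (bias terms excluded).

    With `groups` > 1, each output channel only sees
    in_channels / groups input channels.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


# Illustrative layer: 128 -> 128 channels with a 3x3 kernel.
dense = conv_weight_count(128, 128, 3)
grouped = conv_weight_count(128, 128, 3, groups=32)
print(dense, grouped, dense // grouped)
```

Each group is also an independent, smaller block of work, which is the property the article says maps well onto many parallel threads.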

Graphcore also showed good results for Markov Chain Monte Carlo (MCMC)-based models, a new type of probabilistic algorithm which is used for modelling financial markets. This type of model has been out of reach for many in the finance industry, as it was previously considered too computationally expensive to use, said Graphcore. Early access IPU customers in the finance sector have been able to train their proprietary, optimised MCMC models in 4.5 minutes on IPUs, compared to over 2 hours with their existing hardware, a 26x speed up in training time.

Reinforcement learning (RL), another popular technique in modern AI algorithm development, can also be accelerated compared to typical existing solutions. Graphcore cited a factor of ten improvement in throughput for RL models, even before they are optimised for the IPU.

https://www.eetimes.com/document.asp?doc_id=1335297#

submitted by /u/downtownslim

[D] Machine Learning – WAYR (What Are You Reading) – Week 75

Written on November 16, 2019. Posted in Reddit MachineLearning.

This is a place to share machine learning research papers, journals, and articles that you’re reading this week. If it relates to what you’re researching, by all means elaborate and give us your insight, otherwise it could just be an interesting paper you’ve read.

Please try to provide some insight from your understanding, and please don’t post things which are already in the wiki.

Preferably you should link the arxiv page (not the PDF, you can easily access the PDF from the summary page but not the other way around) or any other pertinent links.

Previous weeks :

1-10	11-20	21-30	31-40	41-50	51-60	61-70	71-80
Week 1	Week 11	Week 21	Week 31	Week 41	Week 51	Week 61	Week 71
Week 2	Week 12	Week 22	Week 32	Week 42	Week 52	Week 62	Week 72
Week 3	Week 13	Week 23	Week 33	Week 43	Week 53	Week 63	Week 73
Week 4	Week 14	Week 24	Week 34	Week 44	Week 54	Week 64	Week 74
Week 5	Week 15	Week 25	Week 35	Week 45	Week 55	Week 65
Week 6	Week 16	Week 26	Week 36	Week 46	Week 56	Week 66
Week 7	Week 17	Week 27	Week 37	Week 47	Week 57	Week 67
Week 8	Week 18	Week 28	Week 38	Week 48	Week 58	Week 68
Week 9	Week 19	Week 29	Week 39	Week 49	Week 59	Week 69
Week 10	Week 20	Week 30	Week 40	Week 50	Week 60	Week 70

Most upvoted papers two weeks ago:

/u/adventuringraw: original TrueSkill paper from Microsoft

/u/Grimm___: http://proceedings.mlr.press/v67/gutierrez17a/gutierrez17a.pdf

Besides that, there are no rules, have fun.

submitted by /u/ML_WAYR_bot

[D] Is the inception architecture/block a failure?

Written on November 16, 2019. Posted in Reddit MachineLearning.

We see many direct uses of ResNet blocks and variations of the ResNet architecture being applied everywhere, but I never see anything similar with Inception blocks. Do you guys have any examples of them being used? Why are they not used more?

submitted by /u/TheAlgorithmist99

[D] Transfer Learning for Survival Models

Written on November 16, 2019. Posted in Reddit MachineLearning.

Survival models are similar to linear regression models, and in this case I am using an AFT survival model. I have trained the model on one dataset and I intend to use it to predict time to failure for another dataset. I would like to discuss the criteria needed for the transfer to work, how the model transfer can be done, and whether there are approaches I can consider for this purpose. Thanks.
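For readers unfamiliar with the model, the resemblance to linear regression is visible in the standard form of an accelerated failure time (AFT) model, written here in generic notation (the symbols are not taken from the post):

```latex
\log T_i = \beta_0 + \boldsymbol{\beta}^{\top} \mathbf{x}_i + \sigma \varepsilon_i
```

Here $T_i$ is the time to failure for unit $i$, $\mathbf{x}_i$ its covariates, $\sigma$ a scale parameter, and $\varepsilon_i$ an error term with a fixed distribution (for example, normal errors give a log-normal AFT model). Reusing the model on a second dataset amounts to assuming that the fitted $\beta_0$, $\boldsymbol{\beta}$ and $\sigma$ also describe that dataset.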

submitted by /u/stat_leaf

[R] Neural Network Processing Neural Networks

Written on November 16, 2019. Posted in Reddit MachineLearning.

I would like to share some research I have been working on in my spare time:

https://arxiv.org/abs/1911.05640

It is about another type of neural network that takes neural networks as inputs and/or produces them as outputs. According to my own experiments, these seem to do well, especially on search problems. I would be really grateful if anyone could provide some feedback.

submitted by /u/firat_tuna

[D] Progress bar for Scikit Learn / Sklearn?

Written on November 16, 2019. Posted in Reddit MachineLearning.

Hi! I was wondering if there’s a way to get a progress bar or some form of indication of how far along a model is when using Scikit learn? I’ve tried using verbose=True, but it doesn’t seem to do much and some models don’t allow it.

Thanks!

submitted by /u/saint—-

[D] Statistical/ML analysis of intention + wordnets, phrasenets

Written on November 16, 2019. Posted in Reddit MachineLearning.

I’m having a mental struggle right now trying to understand how I would go about programming this, and I’m not even sure it’s feasible.

The problem

Let’s say we’re analyzing song lyrics. Let’s say that hypothetically, whenever the word “darkness” is mentioned in a lyric, there is a 23% chance that the word “night” is also mentioned and a 14% chance that the word “doubt” is also in the lyric.

A second and more complex relationship would be that of phrases. We could imagine that whenever the word “darkness” is mentioned, there is a 3.2% chance that the phrase “I’m scared” is somewhere in the lyric and 0.9% chance that the phrase “going to die” is also there.
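The word and phrase figures described above are conditional frequencies, which can be estimated by simple counting. This is a minimal sketch on a made-up toy corpus; the `lyrics` list and the helper names `contains` and `cooccurrence_rate` are hypothetical, and the numbers it produces have nothing to do with the 23% or 3.2% in the example.

```python
import re


def contains(text, term):
    """True if `term` (a word or a phrase) occurs in `text` on word boundaries."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def cooccurrence_rate(lyrics, anchor, target):
    """Estimate P(target appears in a lyric | anchor appears in that lyric)."""
    with_anchor = [text for text in lyrics if contains(text, anchor)]
    if not with_anchor:
        return 0.0
    hits = sum(1 for text in with_anchor if contains(text, target))
    return hits / len(with_anchor)


# Toy corpus: one string per lyric (made up for illustration).
lyrics = [
    "Darkness falls and the night is long",
    "In the darkness I'm scared of the doubt",
    "Tonight we dance in the light",
    "The darkness hides my doubt",
    "No darkness here, only the night sky",
]

p_night = cooccurrence_rate(lyrics, "darkness", "night")
p_doubt = cooccurrence_rate(lyrics, "darkness", "doubt")
p_scared = cooccurrence_rate(lyrics, "darkness", "I'm scared")
print(p_night, p_doubt, p_scared)
```

Matching on word boundaries keeps "night" from being counted inside "Tonight", and the same function handles single words and multi-word phrases.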

A third addition to the complexity would be to add sentiment analysis with a machine learning version of a wordnet that analyzes not only the related words but the related moods.

A fourth addition to the complexity would see morphosyntactical analysis. “I’m scared” is not a feasible assumption as there are many possible subjects in a “scared” sentence, but it would be more feasible for it to be frequent if we said “noun + [to be, present tense] + scared”. This would cover “I’m scared”, “he’s scared”, “we’re scared”, “my son is scared”, etc. And then we could add adverbs and sentence changes (‘our family is, therefore, exceptionally scared’).

The bad way

My current thoughts come from traditional programming: for that analysis to occur, we would grab a reference word, grab the rest of the corpus words, count the occurrences of each corpus word, throw all of those counts into an array belonging to the reference word, and then do that for every word in the text. That would be insanely expensive and would get nowhere.

The ideal but unknown way

A cheaper way to do this would be with an AI plus a vector or matrix datatype. I’ve been exploring the kinds of AI that exist, but I’m very new to this and don’t know which one is most appropriate or which analysis algorithm would be best. I’m not even sure it can be done with current technology in this exact way, or whether the results would differ from what I described. Perhaps an AI would not be as statistically accurate, and would instead rate on a 0-100 scale not the statistical tendency but the “feel” it gets for how “similar” one word is to another based on their common context. How accurate would this be statistically?
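The idea of scoring how similar two words are from their shared context can be illustrated without any neural model. This is a minimal sketch using plain co-occurrence counts and cosine similarity; it is count-based distributional similarity, not BERT, and the toy `docs` list and helper names are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def context_vectors(documents):
    """Map each word to a Counter of the other words sharing its document."""
    vectors = defaultdict(Counter)
    for doc in documents:
        tokens = doc.lower().split()
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:
                    vectors[word][other] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_score(vectors, a, b):
    """Similarity of two words on a 0-100 scale."""
    return round(100 * cosine(vectors[a], vectors[b]))


# Toy corpus (made up): "darkness" and "night" share identical contexts.
docs = ["the darkness is cold", "the night is cold", "the sun is warm"]
vectors = context_vectors(docs)
score_night = similarity_score(vectors, "darkness", "night")
score_sun = similarity_score(vectors, "darkness", "sun")
print(score_night, score_sun)
```

Learned embeddings such as word2vec or BERT replace the raw count vectors with dense ones, but the comparison step is typically the same cosine similarity.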

I’ve been pumped recently with BERT, but I’m not experienced enough to create my own conclusions on the topic.

How feasible do you think this would be?
What are your thoughts about the necessary implications and existing ways to approach them?
What similar projects do you know of that are being developed right now?
How would someone interested in this go into learning more about this specifically without much experience in machine learning in general?

submitted by /u/Live_Think_Diagnosis
