Category: Reddit MachineLearning

[D] Symmetry-equivalent representations

Written on August 19, 2019. Posted in Reddit MachineLearning.

I’m training a regression model, learning the mapping from integer-valued vectors to a single real-valued property. All cyclic permutations of my feature vectors are equivalent, that is, they have the same y. I’m a bit lost in trying to encode this.

One idea I had was to augment the dataset by generating all the cyclic permutations, but I don’t think this is a good way to go at all. I’ve stumbled on strategies to encode cyclic features such as months by mapping them to a periodic function, but in my case this wouldn’t work as the elements of my vector have a different meaning.
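One possible alternative to augmentation is to map every vector to a single canonical representative of its rotation class before it reaches the model, so that all cyclic permutations become the same input. The sketch below is only an illustration under that assumption: the function name `canonical_rotation` is hypothetical, and it picks the lexicographically smallest rotation by brute force (quadratic in the vector length), which may or may not suit the actual data.

```python
def canonical_rotation(vec):
    """Return the lexicographically smallest cyclic rotation of vec as a tuple.

    Every cyclic permutation of the same vector maps to the same tuple,
    so a model fed this output cannot distinguish between them.
    """
    items = tuple(vec)
    if not items:
        return items
    # Enumerate all len(items) rotations and keep the smallest one.
    return min(items[i:] + items[:i] for i in range(len(items)))


# All three rotations of the same vector collapse to one representative.
print(canonical_rotation([3, 1, 2]))
print(canonical_rotation([2, 3, 1]))
print(canonical_rotation([1, 2, 3]))
```

This keeps the dataset at its original size, at the cost of making the chosen representative somewhat arbitrary from the model's point of view.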

submitted by /u/throwervek

[Discussion] Is Sagemaker just a glorified EC2 instance?

Written on August 19, 2019. Posted in Reddit MachineLearning.

I’m a data scientist with a lot of model and math knowledge, and experience with mostly on-prem tools and some GCP. I’m trying to pick up more cloud skills. As I experiment more with Sagemaker, I can’t figure out how it is more than just an EC2 instance with the right libraries installed. Is there anything more to it? What am I missing?

submitted by /u/AlexSnakeKing

[R] Video Frame Interpolation via Cyclic Fine-Tuning and Asymmetric Reverse Flow

Written on August 19, 2019. Posted in Reddit MachineLearning.

Want to convert your video to slow motion?
https://github.com/MortenHannemose/pytorch-vfi-cft

submitted by /u/mohanne

[D] Why isn’t Bayesian inference using Gibbs Sampling / MCMC / HMC done on GPUs?

Written on August 19, 2019. Posted in Reddit MachineLearning.

I’ve seen MultiBUGS, which claims to achieve impressive speedups by exploiting multicore CPUs, but for the most part I’ve not seen any of the existing Bayesian inference tools leverage the GPU. Does anyone know why?

submitted by /u/sicp4lyfe

[D] Which SOTA authorship attribution / text classification model to use?

Written on August 19, 2019. Posted in Reddit MachineLearning.

I’m currently doing research for my thesis project, and was wondering which models to experiment with. I have a large dataset of political speeches (around 180,000) annotated with the respective party (10 parties total), and would like a model to learn to predict the party from a given speech.

My question is, which model is currently best for this type of task? I have some experience with Bi-LSTM models, and also CNN with LSTM. However, I’m very interested in whether other models would perform better at this task, or whether you have any experience with the architecture of these types of models.

submitted by /u/mikkelmedm

Rapid large-scale fractional differencing to minimize memory loss while making a time series stationary. 6x-400x speed up over CPU implementation.

Written on August 19, 2019. Posted in Reddit MachineLearning.

Happy to launch GFD: GPU-accelerated Fractional Differencing. A substantial 6x-400x speed-up for a single-GPU RAPIDS cuDF implementation over a NumPy/Pandas CPU implementation.

Feel free to play with the code on Google Colab, run it on GCP/AWS or your local machine with the entirely self-contained notebook.

Summary

Typically we attempt to achieve some form of stationarity by transforming our time series, most commonly through integer differencing. However, integer differencing removes more memory than is needed to achieve stationarity. An alternative, fractional differencing, allows us to achieve stationarity while retaining as much memory as possible compared to integer differencing. While existing CPU-based implementations are inefficient for running fractional differencing on many large-scale time series, our GPU-based implementation runs fractional differencing up to 400x faster on a single machine.
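To make the idea concrete, here is a minimal pure-Python sketch of fixed-window fractional differencing. It is not the GFD code from the repository and does not use a GPU; the function names `frac_diff_weights` and `frac_diff` and the `window` parameter are assumptions chosen for illustration. The weights come from the binomial expansion of (1 - B)^d, where B is the backshift operator, using the recurrence w_0 = 1, w_k = -w_{k-1} (d - k + 1) / k.

```python
def frac_diff_weights(d, size):
    """Return the first `size` weights of the expansion of (1 - B)^d."""
    weights = [1.0]
    for k in range(1, size):
        # Recurrence: w_k = -w_{k-1} * (d - k + 1) / k
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def frac_diff(series, d, window):
    """Fractionally difference `series` using a fixed window of past values.

    Output element t is sum_k w_k * series[t - k] for k = 0..window-1,
    so the first window - 1 points (which lack enough history) are dropped.
    """
    weights = frac_diff_weights(d, window)
    return [
        sum(w * series[t - k] for k, w in enumerate(weights))
        for t in range(window - 1, len(series))
    ]


# With d = 1 the weights reduce to [1, -1, 0, ...]: ordinary first differencing.
print(frac_diff_weights(1.0, 3))
# With 0 < d < 1 the weights decay slowly, so older values still contribute.
print(frac_diff_weights(0.5, 3))
print(frac_diff([1, 3, 6, 10], 1.0, 2))
```

The slowly decaying weights for fractional d are what let the transformed series keep some dependence on older observations, whereas d = 1 keeps only the most recent change.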

Code

https://github.com/ritchieng/fractional_differencing_gpu

Presentation

https://www.researchgate.net/publication/335159299_GFD_GPU_Fractional_Differencing_for_Rapid_Large-scale_Stationarizing_of_Time_Series_Data_while_Minimizing_Memory_Loss

submitted by /u/ritchieng

[N] Trump falsely claims Google ‘manipulated’ millions of 2016 votes

Written on August 19, 2019. Posted in Reddit MachineLearning.

https://www.cnn.com/2019/08/19/politics/trump-google-manipulated-votes-claim/index.html

The referenced article: https://aibrt.org/downloads/EPSTEIN_et_al_2017-SUMMARY-A_Method_for_Detecting_Bias_in_Search_Rankings-EMBARGOED_until_March_14_2017.pdf

Key point from the article referenced by CNN’s story: Was the bias the same for all search engines? No. The level of pro-Clinton bias we found on Google (0.19) was more than twice as high as the level of pro-Clinton bias we found on Yahoo (0.09).

Among other issues, one thing that CNN did not mention is the presumption that Google is wrong and Yahoo correct, even though there is no ground truth to compare against. Perhaps there were simply more pro-Clinton articles and news stories appearing in those days. More generally, I would guess that Yahoo’s and Google’s engines are simply different algorithms showing different things.

Before someone complains: yes, PageRank was considered “machine learning”, though not deep learning of course. It feels more like graph theory to me, though.

submitted by /u/errorsignal

[D] Is there anything advanced for solving tabular data classification with neural nets?

Written on August 18, 2019. Posted in Reddit MachineLearning.

I found only embeddings for categorical features.

There are so many nets for images, but it looks like for tabular data people just stack dense layers with random hyperparameters.

Am I missing something?

submitted by /u/hadaev

[D] “Inverse Design” to create new optical chip components

Written on August 18, 2019. Posted in Reddit MachineLearning.

I hope discussions of ML applications are OK in this sub. I recently came across an article about researchers in photonics, a field that doesn’t have a lot of analytical equations for calculating performance by hand. They use some basic ML techniques to create high-performance components for photonic integrated circuits. They start with a black box, feed in the desired output performance, and then use basic electromagnetic boundary conditions and ML to work backward to what would be required to get there. They call this “inverse design”.

This paper goes into it a little more and shows an example of the result of the technique: https://arxiv.org/pdf/1504.00095.pdf

submitted by /u/gburdell

[D] All papers claim that their NLP solutions exceed human comprehension. Is it true? Have we solved NLP?

Written on August 18, 2019. Posted in Reddit MachineLearning.

Every month I stumble on at least a few papers with NLP models described as state-of-the-art, close to or even over the human baseline. It seems like humanity should hail our neural-net overlords and retreat from any language-related work.

submitted by /u/rafgro
