Category: Reddit MachineLearning

[D] Is Neuroscience background useful for ML research?

Written on August 18, 2019. Posted in Reddit MachineLearning.

Is there anybody with a neuroscience background who moved to ML research? Any ML researchers who deliberately decided to learn some neuroscience to get new ideas?

Would you say it was worth it to learn neuroscience for you? Would you say it would be better to just focus purely on ML?

I am finishing my CS undergrad and deciding between two options for grad school:

1) PhD in Optimization/ML theory

2) MS in Neuroinformatics (and then eventually going for PhD in ML theory)

I am generally interested in learning neuroscience and understanding how the brain works. However, it seems the theory is not quite there yet, and I do not want to work on the experimental biological side. I ultimately want to work on ML theory, as I think ML has the most impact here and now, while it will likely take some decades until neuroscience is sufficiently developed. That is why I am considering learning some core neuroscience concepts and then trying to apply those concepts to find novel ideas for ML.

The Neuroinformatics MS program is quite flexible and will allow me to focus primarily on ML and open-ended research, while 1/3 of my courses will be in neuroscience. I will work on bio-plausible backprop (some references are here) and maybe spiking neural networks. Somewhat unrelatedly, being there may give me some insight into brain-computer interface research, in which there is growing interest.

I think doing that MS will give me a more diverse background and more ideas for further research in ML theory, and open more doors. However, I am somewhat concerned about whether it is worth it. Wouldn’t doing pure ML leave me in a better position?

Also, I am a bit sceptical about bio-plausible ML research. While it is really interesting, it seems to be a bit of a “toy” problem. We don’t even know if something like backprop happens in the brain, so trying to make it more “bio-plausible” for its own sake is somewhat of an artificial problem.

There was a related discussion: [D] Computational Neuroscience and Machine Learning

submitted by /u/Slayer10101

[D] The link between stationary distributions and SDEs

Written on August 17, 2019. Posted in Reddit MachineLearning.

A somewhat old paper: https://arxiv.org/abs/1506.04696. I recently spent some time going over it, and the paper has some great proofs and discussion. I did, however, find myself looking at their theorem and thinking “how on earth did they find that form of the drift coefficient”, and found that the proof is mainly about showing that if you use that form, the result holds. That’s not too enlightening if you want insight into how they found the result, so I went the other way myself, and in 1D it turns out to be fairly straightforward. I wrote it up in case anybody else finds this view interesting.

https://chrisorm.github.io/SDE-S.html
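
As a small numerical illustration of the link in 1D (a sketch under stated assumptions, not the paper's general recipe and not the derivation from the write-up): the overdamped Langevin SDE dx = -U'(x) dt + sqrt(2) dW has a stationary density proportional to exp(-U(x)). The snippet below simulates it with Euler-Maruyama for the assumed potential U(x) = x^2 / 2, so the samples should look roughly standard normal. The function name, step size, and seed are illustrative choices.

```python
import math
import random


def simulate_langevin(grad_u, n_steps=200_000, dt=0.01, x0=0.0, seed=0):
    """Euler-Maruyama discretisation of dx = -U'(x) dt + sqrt(2) dW."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * dt)  # std of the Brownian increment term
    x = x0
    samples = []
    for _ in range(n_steps):
        x += -grad_u(x) * dt + noise_scale * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def grad_u(x):
    # Assumed potential U(x) = x^2 / 2, so U'(x) = x and the target is N(0, 1).
    return x


samples = simulate_langevin(grad_u)
kept = samples[len(samples) // 10:]  # discard the first 10% as burn-in
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

The discretisation adds a small bias (of order dt) to the variance, so the match to the target density is only approximate.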

submitted by /u/chrisorm

[D] Thesis Subject Suggestion

Written on August 17, 2019. Posted in Reddit MachineLearning.

I apologize if this post doesn’t quite match this subreddit, but I’m really looking for suggestions from active practitioners and researchers in the field and this looks like a great place to find one!

I’m a computer engineering student, about to begin my last year of bachelor’s degree. As you could imagine, the AI course I had to take was nothing but primitive search algorithms. But with the help of one of my professors, I was able to study ML and DL in particular, using a number of courses and a few books. Right now I’ve one paper almost ready for submission and a novel model on its way (both GAN-related).

Now I’m wondering about the subject of my thesis. I talked to my professor a while ago and he suggested working on a model that takes in a face and outputs the same person at some specific age. But I was kinda hoping to work on something more practical that other people could use as well.

I had the idea of creating an application, sorta a graphical interface to Keras, which would allow the user to create the computation graph, build the training loop and normalize the data. After giving this idea some thought, I came to the realization that such a tool can never offer the same range of possibilities a programmer would have with the library itself. Then I thought about targeting practitioners and focusing more on the mainstream architectures, but that also sounded hard to finish in just one year. I’m not sure if a beta version would be worth pursuing, with the hope that I could finish it during my master’s.

The reason that I’m aiming at projects like this is that I’m hoping to get a scholarship to continue my studies abroad which I won’t be able to afford on my own in any way (I’m Iranian and if you check the value of our currency, you’d realize why). I’m hoping to prove my capabilities in my thesis project. My professor says that if I get even a single paper published, that’s enough for a bachelor student, but… you can never be too sure!

Another idea that I’m not sure is worth the effort is working on some sort of knowledge base for ML. With the ever-growing number of publications in this field, it’s getting harder and harder to dig through them, specifically when it comes to comparing models. Now imagine a website with a tree-like taxonomy system in which researchers could introduce their models and others could add their own experiences, implementations, etc. Unfortunately, I’d only be able to implement the website, and I’m nowhere near knowledgeable enough to populate it…

I’d greatly appreciate it if you could guide me towards some type of project that would be an impressive point on my resume for getting a scholarship.

submitted by /u/mfarahmand98

[P] Pytorch Implementation of GANimation with pretrained weights.

Written on August 17, 2019. Posted in Reddit MachineLearning.

Hi all!

TL;DR: I shared on GitHub an implementation of a Conditional GAN model called GANimation. I include pretrained weights and a preprocessed dataset as well.

A few months ago I became really interested in a project called GANimation. The authors (Pumarola et al.) trained a Conditional GAN model capable of modifying facial expressions in a continuous way. This sounded really interesting to me and I wanted to play with the model, but when I tried the authors’ implementation on my computer I had problems with the training process, and I couldn’t find any pretrained weights. Since I also wanted to learn PyTorch at the time, I decided to create my own implementation. As this project is really similar to StarGAN, I started by cloning their repo and used it as a baseline.

In this implementation I provide pretrained models and a preprocessed dataset to facilitate the use of this model. I also included the functions to create the following video.

Applying the expression of the face in the first column to each image in the top row.

Although there’s a lot to improve and clean in the code, I hope it can be useful for anyone that wants to use this model.

submitted by /u/viccpopa

[R] Crash Overview on Keras Software Architecture Basics

Written on August 17, 2019. Posted in Reddit MachineLearning.

This is one of a series of small snippet tutorials sharing my experience with API hacking and TensorFlow hacking. It is an effort to help the community of deep learning software developers grow, starting from the basics of how to work with open-source software (which obviously depends on the language you are working in; this time, Python).

https://uiuran.github.io/keras/tensorflow/deeplearning/2019/08/18/Keras-Deep-Learning-High-Level-API-Dismistified.html

submitted by /u/penalvad00

[D] What are the performance metrics of a word embedding model?

Written on August 17, 2019. Posted in Reddit MachineLearning.

Basically, I am trying to measure the similarity of sentences in my native language. I created word embeddings using word2vec and fastText, and used Word Mover’s Distance to find similar text. Now, how do I determine the performance of each model? P.S. There is no dataset of similar text in my native language.
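
To make the pipeline above concrete, here is a minimal sketch of comparing sentences by how far their word vectors have to "move". It uses the relaxed Word Mover's Distance, a cheap lower bound on the exact WMD (which needs an optimal transport solver), and a hypothetical 2D toy embedding table in place of real word2vec or fastText vectors. All names and numbers are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical toy embeddings; real vectors would come from word2vec or fastText.
EMBEDDINGS = {
    "king": (0.9, 0.1),
    "queen": (0.85, 0.2),
    "apple": (0.1, 0.9),
    "fruit": (0.15, 0.8),
}


def _one_sided(src, dst):
    """Each source word sends all of its weight to its nearest target word."""
    counts = Counter(src)
    total = sum(counts.values())
    targets = set(dst)
    cost = 0.0
    for word, count in counts.items():
        nearest = min(math.dist(EMBEDDINGS[word], EMBEDDINGS[t]) for t in targets)
        cost += (count / total) * nearest
    return cost


def relaxed_wmd(tokens_a, tokens_b):
    """Relaxed Word Mover's Distance: the larger of the two one-sided costs."""
    return max(_one_sided(tokens_a, tokens_b), _one_sided(tokens_b, tokens_a))


near = relaxed_wmd(["king"], ["queen"])
far = relaxed_wmd(["king"], ["apple"])
same = relaxed_wmd(["apple", "fruit"], ["fruit", "apple"])
print(near, far, same)
```

Swapping the embedding table between two trained models while keeping the distance function fixed is one way to compare how each model ranks the same sentence pairs.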

submitted by /u/uname_uknown

[P] I created a Transformer Model package in Tensorflow 2.0 that is extensible and can be used to rebuild GPT-2, BERT, and XLNet.

Written on August 17, 2019. Posted in Reddit MachineLearning.

Hi everyone,

https://pypi.org/project/transformer-model/

pip install transformer-model

I recently took some time to build out an extensible Transformer Model in TF2, mostly for my own future use cases but I thought I’d share with you and possibly get some feedback as well. I have not created many python packages, so if there’s something I missed or seems out of place feel free to create an issue on the repo.

The goal of this project was to create all of the core pieces of the Transformer Model discussed in the “Attention is all you need” paper in a way that I could reuse them to create newer, more SOTA models like BERT and XLNet. I’ve left instructions on how to use this package to train a Transformer model and will be packaging this to go on pypi later today.

My hope is this package saves someone some dev time. If it does, please give the package a star!

submitted by /u/dfcHeadChair

[P] Generating TechCrunch start-up news with GPT2

Written on August 17, 2019. Posted in Reddit MachineLearning.

Hello.

It all started at a startup summit I visited. There I thought of teaching an AI to generate start-up ideas.

I couldn’t find data, so I decided to make it generate start-up news from TechCrunch.

I decided this would be a good time to test GPT-2, which generated believable outputs.

I trained this network using Google Colab, since my ML machine has no GPU and a bad CPU.

Anyway, here are the results!

Prompt: “funding”

s’ 26/“Fitz Frames is planning on giving away pairs of goggles as an homage to the brand, 27/ Stella Brands has revealed it has had an ownership stake in Fitz Frames since 2012 Rex 28/ with Marco Benevento is the birth name of the latest wave of millennial investors Y Combinator 39/ Star Wars Celebration Star Wars Celebration Fitz Frames is currently active on Y Combinator’s fundraising page, raising $15 million in seed funding and several more round funding rounds. Its fourth round is set to be its last, in early 2017, bringing its valuation to between $50 and $100 million, as it attempts to resuscitate the increasingly popular glasses. Fitz Frames hopes to become a more durable, lightweight option for travel, and is pursuing businesses that share its technology with bring the option of attaching sunglasses to people’s feet. It stems from a game where two friends play jack-of-all-trades, attempting to become the first to have their glasses replaced. While other players in the jack-of-all-trades game have tried to make it to the final round, in which the winner is determined by best results from the next round. Players toss duct tape over their faces to make room for new glasses, and drop them somewhere cool for future decorating. The game is currently in its fourth season, and said fourth round will be its biggest to date, with all-new glasses added by the day. A new website is hoping to get around this by giving players a more standardized format for making their purchases. It’s called Lux, and it’s set to’ inspiration’be”ite consumers. “The way we think about technology is related to 1) what we are able to buy and 2) what we are able to want from the experience. If you have me walk you through the steps of making a new pair of glasses, I promise you that that’s it.” Making new glasses is kind of like making your first pair of shoes, but you can’t replace the pair you got from the size chart on the website. 
2) That’s the problem.” You can probably make the case for creating greater accessibility in the least,” Kiss 101 founder and CEO Gabrielle Esposito tells me. “There are tons of opportunities out there. From helping you find a bus stop to helping make your child’s doctor

Prompt: “virtual reality”

“From a physical standpoint I think the physical world is actually kind of the only guide”sarasin” about’ 14  ” with “little’� to none guidances,” added Binks. Most of us, by now, are aware that virtual reality (VR) can be a bit heavy for a country like India, where the current model focuses on physically reaching and leaving virtual walls. The problem is that there isn’t much physical speed at which a user can experience a place ‘‘.” That “b’alls of a city are a walk in the park right now, but they often aren’t possible in South Asia, where walking and running are the norm. Sri Lanka is an alternate model for India, focusing on smaller, backpack-accessible cities that don’t rely on long lines to get by, and are often hard for Asian cities like Japan and the U.S. to get their cities out of the city. In’hibited by the physical and economic advantages of the Indian subcontinent, Binks is’t seeking funding — and in this day and age of artificial intelligence and artificial deadlines, it’s crucial that governments prioritize proven methods of achieving long-term solutions. “We need to be able to rely on the cards and the the thesles,’ pondering pal Jain Cheung, co-founder and CEO of DeepMind. “We need to be able to rely on our bank and have it tell us what’s happening online,’ Cheung said. In addition to launching treating surroundings as if they were holograms, the founders’ latest approach is also use virtual reality as an alternative to the human brain for assessing long-distance travel. For now, though, the focus is on using virtual reality to help make smarter batteries — and connect-and-disconnect sensors — that our bodies have built into our’ brains. “The amount of power we have is going to have an effect on the way we do it.” said FitzGerald. “But from a health and security perspective,” said Cheung. The goal is for the technology to be applied across all healthcare services,” said Feridranco. 
“Any healthcare organization would like to have accurate healthcare data,” he said. “The problem is that

[D] I created a forecasting model to forecast cryptocurrency using sentiment data, and this is the result.

Written on August 17, 2019. Posted in Reddit MachineLearning.

The dataset and code are available here: https://github.com/huseinzol05/Stock-Prediction-Models/blob/master/deep-learning/sentiment-consensus.ipynb

How we gather the data (provided by Bitcurate, bitcurate.com)

Because I don’t have sentiment data related to the stock market, I will use cryptocurrency data: BTC/USDT from Binance.

The close price data came from CCXT, https://github.com/ccxt/ccxt, an open-source cryptocurrency aggregator.
We gather text by streaming Twitter and by crawling a hardcoded list of cryptocurrency Telegram groups and Reddit, and we store everything in Elasticsearch as a single index. We trained a multilingual BERT released by Google, cut to 1/4 of its layers (200MB-ish, originally 700MB-ish), on as much sentiment data as we could find on the internet, leveraging sentiment across multiple languages, e.g. English, Korean, Japanese. In practice, it is very hard to find negative sentiment related to bitcoin / btc in large volume.

We then use elasticsearch-dsl, https://elasticsearch-dsl.readthedocs.io/, to query:

s = s.filter('query_string', default_field='text', query='bitcoin OR btc')

We only query for text that contains bitcoin or btc.

Consensus introduction

We have two questions when talking about consensus. What happens

to the future price if we assume future sentiment is really positive, near 1.0? E.g., China suddenly decides to adopt cryptocurrency, which could cause huge requested volumes.
to the future price if we assume future sentiment is really negative, near 1.0? E.g., hackers suddenly break into Binance or another exchange, or any news wrecks the market through negative sentiment.

So, we use deep learning to simulate this for us! I use a CNN-Seq2Seq architecture this time: unlike an RNN, it does not need to carry over the last memory state, and it is fast to train.

Step

We pulled the last 100 hours of data and aggregated it every 20 minutes, then split the dataset into train and test sets. The test set is the last 10 hours (30 datapoints, 3 * 10), and the earlier remainder is used for training.
Initialize the model and train it for 200 epochs. The model is very sensitive to learning_rate; I found 1e-3 works well. I did not try any hyperparameter search here.
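
The data preparation step can be sketched roughly as follows. This is an illustration under assumptions, not the notebook's code: the data here is synthetic (one fake reading per minute), the aggregation is assumed to be a plain mean per 20-minute bucket, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

BUCKET_MINUTES = 20
BUCKETS_PER_HOUR = 60 // BUCKET_MINUTES  # 3 datapoints per hour
TEST_HOURS = 10


def aggregate(records, bucket_minutes=BUCKET_MINUTES):
    """Average (timestamp, value) records into fixed-width time buckets."""
    buckets = {}
    for ts, value in records:
        # Floor the timestamp to the start of its bucket.
        floored = ts - timedelta(
            minutes=ts.minute % bucket_minutes,
            seconds=ts.second,
            microseconds=ts.microsecond,
        )
        buckets.setdefault(floored, []).append(value)
    return [(key, sum(vals) / len(vals)) for key, vals in sorted(buckets.items())]


def train_test_split(points, test_size):
    """Chronological split: the last test_size points form the test set."""
    return points[:-test_size], points[-test_size:]


# Synthetic stand-in for the real feed: one reading per minute over 100 hours.
start = datetime(2019, 8, 13)
records = [(start + timedelta(minutes=i), float(i)) for i in range(100 * 60)]

points = aggregate(records)
train, test = train_test_split(points, TEST_HOURS * BUCKETS_PER_HOUR)
print(len(points), len(train), len(test))
```

Splitting by position rather than at random keeps every test point strictly later than every training point, which matters for time series.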

Result

https://raw.githubusercontent.com/huseinzol05/Stock-Prediction-Models/master/output/sentiment-consensus.png

Discussion

The model learned that increases in either positive or negative sentiment raise the price. That is why both the positive consensus and the negative consensus caused the price to go up.
Price volatility is higher when negative sentiment is higher, but the volatility is still positive.
Price momentum is higher when negative sentiment is higher, but the momentum is still positive.
Even though the predicted trends are far from the actual test trend, I find this quite fascinating, because I can run the simulation N times to get different variances, and from those I can calculate VaR, potential volatilities and momentums, trading ratios, etc. And if forecasted trends did follow the actual test trend really closely, I would not trust them too much: no model can simulate a stochastic trend that depends on so many real-world parameters.
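
As a rough sketch of the "simulate N times, then compute VaR" idea: given one return per simulated run, a historical-simulation VaR is just a quantile of those returns. The simulated_returns function below is a hypothetical placeholder that draws random numbers; in practice each value would come from one run of the forecasting model. The 95% and 99% levels are illustrative.

```python
import random


def value_at_risk(returns, confidence=0.95):
    """Historical-simulation VaR: the loss exceeded in about (1 - confidence) of runs."""
    ordered = sorted(returns)
    index = min(len(ordered) - 1, round((1.0 - confidence) * len(ordered)))
    return -ordered[index]  # report the loss as a positive number


def simulated_returns(n_runs, seed=0):
    """Placeholder for N forecast runs; each draw stands in for one run's return."""
    rng = random.Random(seed)
    return [rng.gauss(0.001, 0.02) for _ in range(n_runs)]


returns = simulated_returns(1000)
var_95 = value_at_risk(returns, 0.95)
var_99 = value_at_risk(returns, 0.99)
print(f"95% VaR = {var_95:.4f}, 99% VaR = {var_99:.4f}")
```

The quality of such a VaR estimate depends entirely on how realistic the simulated runs are.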

Any comment or feedback?

submitted by /u/huseinzol05

[D] 1080ti vs 2080?

Written on August 17, 2019. Posted in Reddit MachineLearning.

The 2080 has tensor cores and can do 16-bit calculations. The 1080 Ti has 11 GB of RAM.

2080 is slightly cheaper

So which should I get? I’m wondering whether 16-bit calculation means twice the capacity.
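
The arithmetic behind the "twice the capacity" question can be sketched like this: a 16-bit value takes 2 bytes instead of 4, so storing the same tensors in fp16 needs half the memory. The workload sizes below are made-up numbers, and the mixed-precision line assumes a common setup that keeps an extra fp32 master copy of the weights, so the real saving is usually less than 2x. For reference, the RTX 2080 ships with 8 GB of memory.

```python
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2}


def tensor_gib(n_values, dtype):
    """Memory needed to store n_values numbers of the given dtype, in GiB."""
    return n_values * BYTES_PER_VALUE[dtype] / 2**30


# Hypothetical workload: parameter count and stored activation count.
n_params = 100_000_000
n_activations = 1_000_000_000

fp32_total = tensor_gib(n_params + n_activations, "fp32")
fp16_total = tensor_gib(n_params + n_activations, "fp16")

# Assumed mixed-precision setup: fp16 weights and activations,
# plus an fp32 master copy of the weights.
mixed_total = (
    tensor_gib(n_activations, "fp16")
    + tensor_gib(n_params, "fp16")
    + tensor_gib(n_params, "fp32")
)

print(f"fp32: {fp32_total:.2f} GiB, fp16: {fp16_total:.2f} GiB, mixed: {mixed_total:.2f} GiB")
```

Optimizer state and framework overhead are left out, so treat the numbers as a lower bound on real usage.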

Thanks

submitted by /u/pg13mvp
