Category: Reddit MachineLearning

[D] Positional Encoding in Transformer

Written on August 21, 2019. Posted in Reddit MachineLearning.

Hey all,

I was reading the Transformer paper https://arxiv.org/abs/1706.03762. The architecture adds positional encodings to its inputs because the attention layers on their own ignore token order.

I don’t understand two things:

1. Why use sin and cos for the positional encodings? Why not some other function?
2. The paper also talks about learning the positional embeddings. How do you train such embeddings? That is, how does the model know that these embeddings represent position?
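For reference, here is a minimal NumPy sketch of both options, assuming the formulas from section 3.5 of the paper. The function names are illustrative and do not come from any library. The fixed version fills even dimensions with sin and odd dimensions with cos at geometrically spaced frequencies; the paper's stated hypothesis is that this makes the encoding at position pos + k a linear function of the encoding at pos. The learned version is just a table with one row per position. The model is never told that the rows mean position: row k is always added to whatever token sits at position k, and that consistent pairing is the only signal the gradients need.

```python
import numpy as np


def sinusoidal_positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Fixed encoding from "Attention Is All You Need", section 3.5.

    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    """
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    positions = np.arange(max_len)[:, None]        # shape (max_len, 1)
    even_dims = np.arange(0, d_model, 2)[None, :]  # shape (1, d_model / 2), holds 2i
    angles = positions / np.power(10000.0, even_dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe


def learned_positional_table(max_len: int, d_model: int, seed: int = 0) -> np.ndarray:
    """Learned alternative: a randomly initialised table, one row per position.

    In a real model this table would be a trainable parameter updated by
    backpropagation together with the other weights (not shown here).
    """
    rng = np.random.default_rng(seed)
    return rng.normal(scale=0.02, size=(max_len, d_model))


def add_positions(token_embeddings: np.ndarray, position_table: np.ndarray) -> np.ndarray:
    """Add row k of the position table to the token embedding at position k."""
    seq_len = token_embeddings.shape[0]
    return token_embeddings + position_table[:seq_len]
```

Either table can be passed to `add_positions`; the rest of the network is unchanged, which is why the paper can compare the two variants directly.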

Thanks!

submitted by /u/amil123123

[R] Saccader: Improving Accuracy of Hard Attention Models for Vision

Written on August 21, 2019. Posted in Reddit MachineLearning.

submitted by /u/xternalz

[R] FSGAN: Subject Agnostic Face Swapping and Reenactment

Written on August 21, 2019. Posted in Reddit MachineLearning.

paper: https://arxiv.org/pdf/1908.05932.pdf

video: https://www.youtube.com/watch?v=BsITEVX6hkE

submitted by /u/PuzzledProgrammer3

[D] Is the neural architecture search race to beat ImageNet actually relevant anymore?

Written on August 21, 2019. Posted in Reddit MachineLearning.

We’ve seen the limits of training 2D CNNs on RGB images: texture bias, exploitation of surface regularities, and so on.

- ImageNet CNNs behave like bag-of-features models: https://openreview.net/forum?id=SkfMWhAqYQ
- ImageNet-trained CNNs learn to classify texture rather than 3D shape: https://openreview.net/forum?id=Bygh9j09KX
- Backprop on CIFAR-10 exploits surface statistical regularities to get good test accuracy: https://arxiv.org/abs/1711.11561

Is there any point in the race to find the best architecture with neural architecture search (NAS) on ImageNet? We are hitting the limits of training on monocular RGB images with unknown, arbitrary camera poses and intrinsics (focal length, skew, etc.). In the end, what we get is a powerful monocular texture classifier that is easily fooled by adversarial attacks.

The architecture hyperparameters found this way also overfit easily to one dataset. In my experience, training the ImageNet EfficientNet-B0 on CIFAR-10 from scratch (not with transfer learning as in the official paper) gives worse accuracy than ResNet.

Is there ongoing work to create a pose-aware “3D ImageNet”? The closest I can find is probably ShapeNet and various robotics datasets like Princeton SUN-RGBD, but their scale is too small and their domain too narrow.

submitted by /u/tsauri

[D] What should I read/do to get into machine learning?

Written on August 21, 2019. Posted in Reddit MachineLearning.

I’m currently in high school looking to get into machine learning and I was wondering what I should read or what courses I should take.

I am planning on taking the MITx Python course and reading the 2nd edition of Aurélien Géron’s book sometime after it is released. I am also currently eyeing Andriy Burkov’s “The Hundred-Page Machine Learning Book”.

submitted by /u/thetylerwolf

[D] Most outlandish application of Transformer Architecture

Written on August 21, 2019. Posted in Reddit MachineLearning.

I’m conducting some independent research on how effective Transformer, attention, GPT, and BERT architectures are on tasks outside the language domain. I’m curious what the most outlandish implementation you have built is, or the coolest cross-domain application you can think of. Let’s see how much we can Transform the Transformer!

submitted by /u/ThatAi_guy

[P] A command-line tool that spins up EC2 instances of any CPU or GPU specification and configures your laptop so you can connect to secure EC2 remotely using VS Code Remote-SSH and start writing machine-learning algorithms using VS Code’s interactive Python.

Written on August 21, 2019. Posted in Reddit MachineLearning.

GitHub page: https://github.com/provisionpad/provisionpad

I would like to hear your feedback.

submitted by /u/amirzainali

[D] Numenta (neocortical theory group behind the Thousand Brains Theory of Intelligence) is doing an AMA on /r/neuroscience.

Written on August 20, 2019. Posted in Reddit MachineLearning.

Link here.

Joining us is Matt Taylor (/u/rhyolight), who is /u/Numenta’s community manager. He’ll be answering the bulk of the questions here, and will refer any more advanced neuroscience questions to Jeff Hawkins, Numenta’s Co-Founder.

We are on a mission to figure out how the brain works and enable machine intelligence technology based on brain principles. We’ve made significant progress in understanding the brain, and we believe our research offers opportunities to advance the state of AI and machine learning.

Despite the fact that scientists have amassed an enormous amount of detailed factual knowledge about the brain, how it works is still a profound mystery. We recently published a paper titled A Framework for Intelligence and Cortical Function Based on Grid Cells in the Neocortex that lays out a theoretical framework for understanding what the neocortex does and how it does it. It is commonly believed that the brain recognizes objects by extracting sensory features in a series of processing steps, which is also how today’s deep learning networks work. Our new theory suggests that instead of learning one big model of the world, the neocortex learns thousands of models that operate in parallel. We call this the Thousand Brains Theory of Intelligence.

The Thousand Brains Theory is rich with novel ideas and concepts that can be applied to practical machine learning systems and provides a roadmap for building intelligent systems inspired by the brain. I am excited to be a part of this mission! Ask me anything about our theory, code, or community.

Relevant Links:

Past AMA:
/r/askscience previously hosted Numenta a couple of months ago. Check for further Q&A.

Numenta HTM School:
A series of videos introducing HTM Theory; no background in neuroscience, math, or CS required.

submitted by /u/blueneuronDOTnet

[D] OpenAI’s official 774M GPT-2 model released. 1.5B model might be released, dependent on 4 research organizations.

Written on August 20, 2019. Posted in Reddit MachineLearning.

Here are the links:

https://openai.com/blog/gpt-2-6-month-follow-up/

https://github.com/openai/gpt-2

submitted by /u/permalip

[P] Deploy GPT-2 on AWS

Written on August 20, 2019. Posted in Reddit MachineLearning.

I wrote a post about how I deployed OpenAI’s GPT-2 as a web API on my AWS account. I used code from the OpenAI repo to download and export the model and Cortex to run it on AWS.

You can use this command to test out the API:

curl -k -X POST -H "Content-Type: application/json" -d '{"samples":[{"text": "machine learning"}]}' https://aefa719d5c44011e9adc30ea9bac8e9a-1873012518.us-west-2.elb.amazonaws.com/text/generator
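
For anyone who would rather call the API from Python, the same request can be sketched with the standard library alone. This is only a rough equivalent of the curl command: the helper names `build_request` and `send` are made up for illustration, the endpoint may no longer be live, and the response is returned as raw text because its format is not described here.

```python
import json
import ssl
import urllib.request

# URL copied from the curl command above; the endpoint may no longer be live.
API_URL = (
    "https://aefa719d5c44011e9adc30ea9bac8e9a-1873012518"
    ".us-west-2.elb.amazonaws.com/text/generator"
)


def build_request(text: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request with the same JSON body and header as the curl call."""
    payload = json.dumps({"samples": [{"text": text}]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    # Equivalent of curl's -k flag: skip TLS certificate verification.
    # Only do this for a quick test against an endpoint you trust.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `send(build_request("machine learning"))` would issue the same request as the curl command.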

submitted by /u/ospillinger