Category: Reddit MachineLearning

[D] Positional Encoding in Transformer

Hey all,

I was reading the Transformer paper, https://arxiv.org/abs/1706.03762. This architecture uses positional encoding because the attention layers on their own ignore token order.

I don’t understand two things:

  1. Why use sin and cos as positional embeddings? Why not some other function?
  2. They also talk about training these positional embeddings. How do you go about training such embeddings? That is, how do you let the model know that these embeddings represent position?

Thanks!
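For reference, the fixed encoding the question refers to can be sketched in a few lines. This is a minimal pure-Python sketch of the sinusoidal formula from the paper, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). The function name and the even-`d_model` restriction are assumptions made for this illustration, not something taken from the paper's code.

```python
import math


def sinusoidal_positional_encoding(max_len, d_model):
    """Return a max_len x d_model table of fixed sinusoidal position encodings.

    Hypothetical helper for illustration; assumes d_model is even.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even in this sketch")
    table = []
    for pos in range(max_len):
        row = []
        for i in range(d_model // 2):
            # Each sin/cos pair shares one wavelength, growing geometrically with i.
            angle = pos / (10000 ** (2 * i / d_model))
            row.append(math.sin(angle))  # dimension 2i
            row.append(math.cos(angle))  # dimension 2i + 1
        table.append(row)
    return table


# The table is added element-wise to the token embeddings before the first layer.
encodings = sinusoidal_positional_encoding(max_len=4, d_model=8)
```

The paper motivates this choice by noting that, for a fixed offset k, PE(pos + k) is a linear function of PE(pos), which may make it easier to attend by relative position. The learned alternative is typically an ordinary lookup table indexed by position and trained by backpropagation like any other embedding; the paper reports nearly identical results for the two variants.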

submitted by /u/amil123123

[D] Is the neural architecture search race to beat ImageNet actually relevant anymore?

We’ve seen the limits of training a 2D CNN on RGB images: texture bias, exploitation of surface regularities, and so on.

  1. ImageNet CNNs behave like bag-of-features models (https://openreview.net/forum?id=SkfMWhAqYQ)
  2. ImageNet CNNs actually learn to classify texture instead of learning 3D shapes (https://openreview.net/forum?id=Bygh9j09KX)
  3. Backprop on CIFAR-10 exploits surface statistical regularities to get good test accuracy (https://arxiv.org/abs/1711.11561)

Is there any point in the race to find the best architecture with neural architecture search (NAS) on ImageNet? We are hitting the limits of training on monocular RGB images with unknown, arbitrary camera poses and intrinsics (focal length, skew, etc.). In the end, what we get is a powerful monocular texture classifier that is easily duped by adversarial attacks.

The architecture hyperparameters that NAS finds also overfit easily to one dataset. In my experience, training the ImageNet-tuned EfficientNet-B0 on CIFAR-10 from scratch (not with transfer learning, as in the official paper) gives worse accuracy than a ResNet.

Is there ongoing work to create a pose-aware “3D ImageNet”? The closest I can find is probably ShapeNet and various robotics datasets like Princeton SUN-RGBD. But their scale is too small and their domain too narrow.

submitted by /u/tsauri

[D] What should I read/do to get into machine learning?

I’m currently in high school looking to get into machine learning and I was wondering what I should read or what courses I should take.

I am planning on taking the MITx Python course and reading the 2nd edition of Aurélien Géron’s book sometime after it is released. I am also currently eyeing Andriy Burkov’s “The Hundred-Page Machine Learning Book”.

submitted by /u/thetylerwolf

[D] Most outlandish application of the Transformer architecture

I’m conducting some independent research on the effectiveness of Transformer, attention, GPT, and BERT architectures on tasks outside the language domain. I’m curious to know the most outlandish implementation you have done, or the coolest cross-domain application you can think of. Let’s see how much we can Transform the Transformer!

submitted by /u/ThatAi_guy

[D] Numenta (the neocortical theory group behind the Thousand Brains Theory of Intelligence) is doing an AMA on /r/neuroscience.

Link here.

Joining us is Matt Taylor (/u/rhyolight), who is /u/Numenta’s community manager. He’ll be answering the bulk of the questions here, and will refer any more advanced neuroscience questions to Jeff Hawkins, Numenta’s Co-Founder.

We are on a mission to figure out how the brain works and enable machine intelligence technology based on brain principles. We’ve made significant progress in understanding the brain, and we believe our research offers opportunities to advance the state of AI and machine learning.

Despite the fact that scientists have amassed an enormous amount of detailed factual knowledge about the brain, how it works is still a profound mystery. We recently published a paper titled A Framework for Intelligence and Cortical Function Based on Grid Cells in the Neocortex that lays out a theoretical framework for understanding what the neocortex does and how it does it. It is commonly believed that the brain recognizes objects by extracting sensory features in a series of processing steps, which is also how today’s deep learning networks work. Our new theory suggests that instead of learning one big model of the world, the neocortex learns thousands of models that operate in parallel. We call this the Thousand Brains Theory of Intelligence.

The Thousand Brains Theory is rich with novel ideas and concepts that can be applied to practical machine learning systems and provides a roadmap for building intelligent systems inspired by the brain. I am excited to be a part of this mission! Ask me anything about our theory, code, or community.

Relevant Links:

  • Past AMA:
    /r/askscience hosted Numenta a couple of months ago. Check it for further Q&A.
  • Numenta HTM School:
    A series of videos introducing HTM Theory; no background in neuroscience, math, or CS required.

submitted by /u/blueneuronDOTnet

[P] Deploy GPT-2 on AWS

I wrote a post about how I deployed OpenAI’s GPT-2 as a web API on my AWS account. I used code from the OpenAI repo to download and export the model and Cortex to run it on AWS.

You can use this command to test out the API:

curl -k -X POST -H "Content-Type: application/json" -d '{"samples":[{"text": "machine learning"}]}' https://aefa719d5c44011e9adc30ea9bac8e9a-1873012518.us-west-2.elb.amazonaws.com/text/generator 
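
The same request can be sketched in Python using only the standard library. This is a rough equivalent of the curl command above, not code from the post: the helper names `build_request` and `send` are hypothetical, the response format is not documented here so it is returned as raw text, and the endpoint may no longer be live.

```python
import json
import ssl
import urllib.request

# Endpoint copied from the curl command above.
API_URL = (
    "https://aefa719d5c44011e9adc30ea9bac8e9a-1873012518"
    ".us-west-2.elb.amazonaws.com/text/generator"
)


def build_request(text, url=API_URL):
    """Build the POST request with the same JSON body as the curl example."""
    payload = json.dumps({"samples": [{"text": text}]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the raw response body as text."""
    # Mirrors curl's -k flag: certificate verification is skipped,
    # which is unsafe for anything beyond a quick test.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.read().decode("utf-8")


request = build_request("machine learning")
```

Calling `send(request)` performs the actual network call; it is left out of the block so the sketch can be run without the endpoint being reachable.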

submitted by /u/ospillinger