Author: torontoai

[P] Agent Learns to Park a Car using Unity ML-Agents / Deep Reinforcement Learning (PPO)

Video available here (YouTube): https://youtu.be/VMp6pq6_QjI

Context:

An AI learns to park a car in a parking lot in a 3D physics simulation. The simulation was implemented using Unity’s ML-Agents framework (https://unity3d.com/machine-learning). The AI consists of a deep Neural Network with 3 hidden layers of 128 neurons each. It is trained with the Proximal Policy Optimization (PPO) algorithm.

The inputs of the Neural Network are the readings of eight depth sensors, the car's current speed and position, and its position relative to the target. The outputs of the Neural Network are interpreted as engine force, braking force and turning force (continuous values). These outputs can be seen in the top right corner of the zoomed-out camera shots.
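As a rough illustration of the architecture described above, here is a minimal pure-Python sketch of a forward pass through a network with 3 hidden layers of 128 neurons and 3 continuous outputs. The observation size of 13 (8 depth readings, 1 speed, 2 position values, 2 relative target position values), the ReLU and tanh activations, and the weight initialization are all assumptions for illustration; the post does not specify them, and the actual project uses Unity's ML-Agents rather than hand-written code like this. A PPO policy would also typically sample actions from a distribution, which this deterministic sketch omits.

```python
import math
import random

# Assumed observation layout (NOT specified in the post):
# 8 depth readings + 1 speed + 2 position + 2 relative target position = 13
OBS_SIZE = 13
HIDDEN_SIZES = [128, 128, 128]  # from the post: 3 hidden layers of 128 neurons
ACTION_SIZE = 3                 # engine force, braking force, turning force


def init_layer(n_in, n_out, rng):
    """Create one dense layer with small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)  # assumed initialization scheme
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def init_policy(seed=0):
    """Randomly initialized network: the agent starts with random behaviour."""
    rng = random.Random(seed)
    sizes = [OBS_SIZE] + HIDDEN_SIZES + [ACTION_SIZE]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(policy, obs):
    """Map one observation vector to three continuous action values."""
    x = list(obs)
    last = len(policy) - 1
    for i, (weights, biases) in enumerate(policy):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < last:
            x = [max(0.0, v) for v in x]    # ReLU (assumed hidden activation)
        else:
            x = [math.tanh(v) for v in x]   # squash outputs to [-1, 1] (assumed)
    return x


def interpret(outputs):
    """Label the three outputs the way the post describes them."""
    engine, brake, turn = outputs
    return {"engine": engine, "brake": brake, "turn": turn}


policy = init_policy(seed=42)
observation = [random.Random(1).random() for _ in range(OBS_SIZE)]
action = interpret(forward(policy, observation))
```

The hypothetical `interpret` helper only names the outputs; how those values are scaled into actual forces inside the simulation is not described in the post.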

The AI starts off with random behaviour, i.e. the Neural Network is initialized with random weights. It then gradually learns to solve the task by adjusting its behaviour in response to feedback from the environment.

The AI is rewarded with small positive signals for getting closer to the parking spot, which is outlined in red, and gets a larger reward when it actually reaches the parking spot and stops there. The final reward for reaching the parking spot depends on how parallel the car is to the actual parking direction when it stops. If the car stops at a 90° angle to the parking direction, for instance, the AI receives only a very small reward compared to what it would get for stopping completely parallel. The AI is penalized with a negative reward signal when it drives further away from the parking spot or crashes into an obstacle.
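The reward scheme described above could be sketched roughly as follows. Every constant and function name here is a hypothetical choice for illustration: the post gives no numeric values, and the cosine-based alignment factor is just one plausible way to make the final reward depend on how parallel the car is.

```python
import math

# All constants are illustrative assumptions, not values from the project.
APPROACH_SCALE = 0.01   # reward per unit of distance gained toward the spot
PARKED_BONUS = 1.0      # maximum reward for stopping in the spot
MIN_FRACTION = 0.1      # fraction of the bonus kept at a 90 degree angle
CRASH_PENALTY = -1.0    # penalty for hitting an obstacle


def step_reward(prev_distance, distance, crashed):
    """Per-step shaping reward.

    Positive when the car moved closer to the parking spot, negative when
    it moved away, and a fixed penalty when it crashed.
    """
    if crashed:
        return CRASH_PENALTY
    return APPROACH_SCALE * (prev_distance - distance)


def parking_reward(angle_deg):
    """Final reward for stopping in the spot, scaled by alignment.

    angle_deg is the angle between the car's heading and the parking
    direction. |cos| is 1 when parallel (0 or 180 degrees) and 0 when
    perpendicular (90 degrees).
    """
    alignment = abs(math.cos(math.radians(angle_deg)))
    return PARKED_BONUS * (MIN_FRACTION + (1.0 - MIN_FRACTION) * alignment)


closer = step_reward(prev_distance=5.0, distance=4.5, crashed=False)
further = step_reward(prev_distance=4.5, distance=5.0, crashed=False)
parallel = parking_reward(0.0)
perpendicular = parking_reward(90.0)
```

With these assumed values, a perpendicular stop earns about a tenth of what a parallel stop earns, which mirrors the "very small amount" described in the post without claiming to match the real numbers.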

https://i.redd.it/cul3s0nanrk31.png

submitted by /u/SamuelArzt

[P] Content Update in NLP Tutorial repo: Text Classification on HuffPost news articles

Content Update in PyTorch NLP Tutorial repo.

Text Classification, with simple annotation.

  • Dataset: HuffPost news corpus including corresponding category.
  • Pre-trained word vectors: How pre-trained word representations affect model performance (via ablation study)

The model trained on this dataset identifies the category of a news article based on its headline and description.

link : https://github.com/lyeoni/nlp-tutorial/tree/master/news-category-classifcation

submitted by /u/lyeoni

[D] Is Google MediaPipe the future of ML?

I had an idea a long time ago that would allow us to make “soft AGI” by turning models into “nodes” that each have an input and an output representation, and routing between them to achieve the task needed.

Basically a “marketplace” of models that would have the metadata to be easily searchable and standardized types/representations that would allow them to link seamlessly. We could then build a system that would get a text query and phone sensors as input and would determine input-output representations for the query. The system would then find a route (ideally using the most cost-efficient path, nodes being rated with a compute cost) to achieve the task.
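To make the routing idea concrete, here is a minimal sketch that treats representation types as graph vertices and model nodes as weighted edges, then finds the cheapest chain with Dijkstra's algorithm. The `Node` tuple, the type names, and the costs are all hypothetical; nothing here reflects MediaPipe's actual API, and Dijkstra is simply one standard way to realise "most cost-efficient path".

```python
import heapq
from collections import namedtuple

# Hypothetical description of a model "node" in the marketplace:
# it converts one representation type into another at some compute cost.
Node = namedtuple("Node", ["name", "input_type", "output_type", "cost"])


def cheapest_route(nodes, source_type, target_type):
    """Return (total_cost, [node names]) for the cheapest chain of nodes
    turning source_type into target_type, or None if no chain exists.

    Representation types are graph vertices, nodes are directed weighted
    edges, and the search is Dijkstra's algorithm (costs must be >= 0).
    """
    frontier = [(0.0, source_type, [])]
    best = {source_type: 0.0}
    while frontier:
        cost, rep, path = heapq.heappop(frontier)
        if rep == target_type:
            return cost, path
        if cost > best.get(rep, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for node in nodes:
            if node.input_type != rep:
                continue
            new_cost = cost + node.cost
            if new_cost < best.get(node.output_type, float("inf")):
                best[node.output_type] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost, node.output_type, path + [node.name]),
                )
    return None


# Made-up marketplace entries for illustration only.
marketplace = [
    Node("speech_to_text", "audio", "text", 5.0),
    Node("translate_en_fr", "text", "text_fr", 2.0),
    Node("speech_to_french", "audio", "text_fr", 9.0),
]

route = cheapest_route(marketplace, "audio", "text_fr")
```

In this toy marketplace the two-step chain is cheaper than the single end-to-end node, so the router would prefer composing smaller nodes. A real system would also need the step the post mentions first: inferring the source and target representations from the query.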

Is Google trying to make this with MediaPipe?

I think once it is out of beta they will open a repository for our “Calculators” (as they call nodes) to make this happen. They could maybe provide cheap or free execution on Google Cloud but use the data passing through as training material. The thing is, we could make models more modular and reuse pretrained nodes already on the system. It would make training really fast, as most of the model is already pretrained, and it would help avoid overfitting because the nodes are used for other tasks as well.

What do you think of it?

submitted by /u/hapliniste