[D] Can someone explain how, in the reinforcement learning algorithm A3C, the multiple workers ensure they won't retrieve the same parameters from the global network they just updated?

I understand that in A3C ( https://arxiv.org/abs/1602.01783 ) the multiple workers apply their gradient updates to the global network asynchronously.

But how do the workers ensure that they won’t retrieve the same parameters from the global network they just updated?
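To make the question concrete, here is a minimal sketch of the sync / compute / push cycle as I understand it from the paper's pseudocode. Everything in it is a toy assumption for illustration: the "global network" is just a list of two floats, the loss is a simple quadratic rather than an actor-critic loss, and the names `global_theta`, `TARGET`, `worker` and `run` are hypothetical. I also use a lock for clarity, whereas the paper describes lock-free updates.

```python
import threading

# Hypothetical toy setup: the "global network" is a list of floats and each
# worker minimizes sum((theta - target)^2) instead of a real RL loss.
global_theta = [0.0, 0.0]
TARGET = [1.0, -2.0]
LEARNING_RATE = 0.05
lock = threading.Lock()  # assumption: used here for clarity only


def worker(num_updates):
    for _ in range(num_updates):
        # 1. Sync: copy whatever the global parameters are right now.
        #    Nothing checks whether they changed since this worker's last push.
        with lock:
            local_theta = list(global_theta)

        # 2. Compute a gradient from the local copy only.
        grad = [2.0 * (p - t) for p, t in zip(local_theta, TARGET)]

        # 3. Push: apply the gradient to the global parameters. Other workers
        #    may have pushed in between, so the gradient can be slightly stale.
        with lock:
            for i, g in enumerate(grad):
                global_theta[i] -= LEARNING_RATE * g


def run(num_workers=4, num_updates=200):
    threads = [
        threading.Thread(target=worker, args=(num_updates,))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(global_theta)
```

In this sketch a worker that syncs right after its own push simply gets parameters that already include its own update, plus any updates other workers pushed in the meantime, and the global parameters still move toward `TARGET`.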

Thank you.

submitted by /u/ml4564


Toronto AI is a social and collaborative hub that unites AI innovators in Toronto and the surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, VR, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.