
[D] Can someone explain to me how, in the reinforcement learning algorithm A3C, the multiple workers ensure they won't retrieve the same parameters from the global network they just updated?

I understand that in A3C (https://arxiv.org/abs/1602.01783), the multiple workers apply their gradient updates to the global network asynchronously.

But how do the workers ensure that they won’t retrieve the same parameters from the global network they just updated?
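For concreteness, here is a minimal sketch of the asynchronous pattern I mean, assuming a PyTorch-style shared model (the network, learning rate, and loss are placeholders, not the paper's actual setup):

```python
import torch
import torch.multiprocessing as mp

# Placeholder network standing in for the A3C actor-critic.
class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

def worker(global_net, steps=5):
    local_net = TinyNet()
    for _ in range(steps):
        # 1. Pull whatever parameters the global network holds right now;
        #    no coordination with other workers is attempted.
        local_net.load_state_dict(global_net.state_dict())

        # 2. Compute gradients on the local copy (dummy loss here).
        loss = local_net(torch.randn(1, 4)).sum()
        local_net.zero_grad()
        loss.backward()

        # 3. Push gradients into the shared global parameters without a
        #    lock, so concurrent writes from workers may interleave.
        with torch.no_grad():
            for gp, lp in zip(global_net.parameters(),
                              local_net.parameters()):
                gp -= 0.01 * lp.grad

if __name__ == "__main__":
    global_net = TinyNet()
    global_net.share_memory()  # place parameters in shared memory
    procs = [mp.Process(target=worker, args=(global_net,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

As far as I can tell, nothing in this pattern prevents a worker from reading back parameters it just wrote, and the paper's lock-free, Hogwild!-style updates seem to simply tolerate stale or overlapping parameters rather than prevent them. Is that the whole answer, or is there some mechanism I'm missing?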

Thank you.

submitted by /u/ml4564
