[D] (on-policy) exploration when adding new actions

I am using policy gradient DRL with on-policy exploration in a discrete domain.

After some time, once the policy has explored significantly and the network performs decently, I have to handle newly discovered actions. I can “widen” and initialize the network to handle these actions.
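For concreteness, here is a minimal sketch of one way the “widen and initialize” step could look. It assumes the policy ends in a linear softmax head (logits = h @ W + b), which the post does not state. The names `widen_head`, `h_ref`, and `new_mass` are hypothetical and chosen only for illustration. New weight columns start at zero, and the new biases are set so that the new actions together receive probability `new_mass` at a reference hidden state, while the relative probabilities of the old actions stay unchanged.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def widen_head(W, b, n_new, h_ref, new_mass=0.2):
    """Append n_new actions to a linear softmax head (hypothetical helper).

    W: (hidden_dim, n_old) weights, b: (n_old,) biases.
    h_ref: (hidden_dim,) reference hidden state, e.g. a batch mean (assumption).
    new_mass: total probability given to the new actions at h_ref, in (0, 1).
    """
    old_logits = h_ref @ W + b
    # log-sum-exp of the old logits, computed stably
    m = old_logits.max()
    lse = m + np.log(np.exp(old_logits - m).sum())
    # Shared logit c for the new actions such that
    # n_new * exp(c) / (exp(lse) + n_new * exp(c)) == new_mass
    c = lse + np.log(new_mass / (1.0 - new_mass)) - np.log(n_new)
    # Zero weights: the new logits equal c for every state until trained.
    W_wide = np.concatenate([W, np.zeros((W.shape[0], n_new))], axis=1)
    b_wide = np.concatenate([b, np.full(n_new, c)])
    return W_wide, b_wide

# Usage: widen a 3-action head to 5 actions.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 3)), rng.normal(size=3)
h_ref = rng.normal(size=4)
W_wide, b_wide = widen_head(W, b, n_new=2, h_ref=h_ref, new_mass=0.2)
probs = softmax(h_ref @ W_wide + b_wide)
```

Because the new logits are constant across states at first, the actual probability mass on the new actions varies from state to state; `new_mass` only holds exactly at `h_ref`. Raising `new_mass` is one simple knob for deliberately over-sampling the new actions while still sampling from the policy itself.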

Is there a recommended way to increase the exploration rate, and specifically to “over-explore” these new actions?

The data domain itself is structured/tabular/wide.

submitted by /u/so_tiredso_tired