[D] (on-policy) exploration when adding new actions

I am using policy gradient DRL with on-policy exploration in a discrete domain.

After some time, with significant exploration and decent network performance, I have to handle newly discovered actions. I can “widen” the network and initialize the new outputs to handle these actions.
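To make the “widening” step concrete, here is a minimal sketch in plain NumPy. It assumes the policy ends in a linear head that maps a hidden vector to one logit per action; the function name `widen_policy_head` and the parameters `init_logit` and `scale` are hypothetical, chosen only for illustration. The old rows are copied unchanged, so the learned preferences among the existing actions are preserved, and the new rows start near zero.

```python
import numpy as np

def widen_policy_head(W, b, n_new, init_logit=0.0, scale=1e-2, rng=None):
    """Append n_new action rows to a linear policy head (logits = W @ h + b).

    Hypothetical helper: the old rows are kept as-is, the new rows get small
    random weights and a chosen starting bias (init_logit).
    """
    rng = np.random.default_rng() if rng is None else rng
    hidden = W.shape[1]
    W_new = rng.normal(0.0, scale, size=(n_new, hidden))
    b_new = np.full(n_new, init_logit)
    return np.vstack([W, W_new]), np.concatenate([b, b_new])

def softmax(z):
    z = z - z.max()          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Usage: a head with 4 actions and 8 hidden units gains 2 new actions.
rng = np.random.default_rng(0)
W_old = rng.normal(size=(4, 8))
b_old = np.zeros(4)
W, b = widen_policy_head(W_old, b_old, n_new=2, rng=rng)

h = rng.normal(size=8)       # stand-in for the network's last hidden layer
probs = softmax(W @ h + b)   # distribution over all 6 actions
```

The starting bias controls how much probability the new actions receive right after widening: a value well below the typical old logits leaves the existing policy almost untouched, while a larger value shifts probability mass toward the new actions immediately.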

Is there a recommended way to increase the exploration rate, and specifically to “over-explore” these new actions?
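To illustrate what “over-exploring” could look like, here is one possible sketch, not an established recommendation: add a temporary bonus to the logits of the new actions and let it decay over training. The names `boosted_policy` and `bonus_schedule`, the starting bonus of 2.0, and the half-life of 1000 steps are all assumptions made for this example.

```python
import numpy as np

def bonus_schedule(step, start=2.0, half_life=1000):
    """Hypothetical schedule: the logit bonus halves every half_life steps."""
    return start * 0.5 ** (step / half_life)

def boosted_policy(logits, new_mask, bonus):
    """Softmax over logits, with `bonus` added only where new_mask is 1."""
    z = logits + bonus * new_mask
    z = z - z.max()          # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Usage: 4 old actions and 2 new ones, all with equal raw logits.
logits = np.zeros(6)
new_mask = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0])

p_early = boosted_policy(logits, new_mask, bonus_schedule(0))
p_late = boosted_policy(logits, new_mask, bonus_schedule(10_000))

# Sample an action from the boosted distribution.
rng = np.random.default_rng(0)
action = rng.choice(len(p_early), p=p_early)
```

If actions are sampled from the boosted distribution, the policy-gradient update should use log-probabilities from that same boosted distribution (treating the bonus as a constant) so that the data remain on-policy. Sampling from the boosted distribution while differentiating the unboosted one would make the updates off-policy unless corrected, for example with importance weights.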

The data domain itself is structured/tabular/wide.

submitted by /u/so_tiredso_tired
