Hi Reddit community, I'm currently teaching myself reinforcement learning. I have downloaded a few code samples to try out and get a feel for how they work. One of them [code A] uses A3C on CartPole-v0 and learns very well. Another [code B] uses DQN on LunarLander-v2, and it also trains a capable agent.
I then changed the environment in code A (A3C) to LunarLander-v2 and to MountainCar-v0. There were no errors, but the agent failed to learn. Likewise, when I changed the environment in code B (DQN) to CartPole-v0 and to MountainCar-v0, it did not learn either.
Why is that? Is it because different environments have different reward systems? Or is it that the hyperparameters that worked for CartPole-v0 do not work for LunarLander-v2?
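To make the question concrete, here is a rough side-by-side of the three tasks. It is only a sketch: the numbers are what I understand Gym's standard registration to be for these environment IDs, and `EnvSpec`, `SPECS`, and `differing_fields` are hypothetical names made up for illustration, not part of Gym or of code A or code B.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EnvSpec:
    """Hypothetical summary record; not a Gym class."""
    obs_dim: int        # length of the observation vector
    n_actions: int      # number of discrete actions
    max_steps: int      # episode step limit
    reward_note: str    # informal description of the reward signal
    solved_at: float    # registered reward threshold


# Values assumed from Gym's standard registration for these IDs.
SPECS = {
    "CartPole-v0": EnvSpec(4, 2, 200, "+1 per step survived", 195.0),
    "MountainCar-v0": EnvSpec(2, 3, 200, "-1 per step until the goal", -110.0),
    "LunarLander-v2": EnvSpec(
        8, 4, 1000, "shaped, roughly +/-100 for landing or crashing", 200.0
    ),
}


def differing_fields(env_a: str, env_b: str) -> list:
    """Return the names of the summary fields that differ between two tasks."""
    a, b = SPECS[env_a], SPECS[env_b]
    return [
        f.name for f in fields(EnvSpec)
        if getattr(a, f.name) != getattr(b, f.name)
    ]


if __name__ == "__main__":
    for pair in [("CartPole-v0", "LunarLander-v2"),
                 ("CartPole-v0", "MountainCar-v0")]:
        print(pair, "differ in:", differing_fields(*pair))
```

So the reward scale and the episode length both change between tasks. In MountainCar-v0 in particular, every episode that never reaches the goal ends with the same return of -200, which might be why neither piece of code learns there.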
submitted by /u/ErmJustSaying