[D] Why did the same reinforcement learning algorithm work for MountainCar but not for LunarLander (and others)?

Hi Reddit community, I’m currently self-learning/exploring reinforcement learning. I have downloaded a few code samples to try out and get a feel for them. One of them [code A] uses A3C on CartPole-v0, and it learns very well. Another [code B] uses DQN on LunarLander-v2, and it managed to train a smart agent too.

Then I changed the environment in code A (A3C) to LunarLander-v2 and to MountainCar-v0. There weren’t any errors, but the agent failed to learn. Likewise, when I changed the environment in code B (DQN) to CartPole-v0 and to MountainCar-v0, it didn’t learn either.

Why is that? Is it because different environments have different reward systems? Or is it that the hyperparameters that worked for CartPole-v0 do not work for LunarLander-v2?
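To make the reward-system question concrete, here is a rough sketch of what a random policy sees on a MountainCar-style task. It is a hand-written approximation of the classic MountainCar dynamics in plain Python, not the Gym implementation and not taken from code A or code B. The function name `mountain_car_random_return` and the constants are assumptions for illustration. The point is that a reward of -1 on every step gives an exploring agent almost no signal unless it happens to reach the goal.

```python
import math
import random


def mountain_car_random_return(seed, max_steps=200):
    """Episode return of a uniformly random policy on a MountainCar-style task.

    Hypothetical helper: the dynamics below are an assumed approximation of
    the classic MountainCar formulation, written out by hand for illustration.
    """
    rng = random.Random(seed)
    position = rng.uniform(-0.6, -0.4)  # start near the bottom of the valley
    velocity = 0.0
    total_reward = 0.0
    for _ in range(max_steps):
        action = rng.randrange(3)  # 0 = push left, 1 = no push, 2 = push right
        # Engine push plus gravity along the slope.
        velocity += (action - 1) * 0.001 - 0.0025 * math.cos(3 * position)
        velocity = max(-0.07, min(0.07, velocity))
        position += velocity
        position = max(-1.2, min(0.6, position))
        if position <= -1.2 and velocity < 0:
            velocity = 0.0  # the left wall stops the car
        total_reward -= 1.0  # -1 per step until the goal is reached
        if position >= 0.5:
            break  # goal reached, episode ends early
    return total_reward


returns = [mountain_car_random_return(seed) for seed in range(100)]
distinct_returns = sorted(set(returns))
print(distinct_returns)
```

If nearly every episode comes back with the same return, random exploration is not producing any reward differences to learn from. That would point toward the reward structure (and the exploration settings tuned for a denser-reward task such as CartPole-v0) rather than a bug in the swapped code, though this sketch alone cannot confirm that.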

submitted by /u/ErmJustSaying