Why is the episode done after 200 time steps (Gym environment MountainCar)?

When using the MountainCar-v0 environment from OpenAI Gym in Python, the value done becomes True after 200 time steps. Why is that? Since the goal state hasn't been reached, the episode shouldn't be done.

import gym

env = gym.make('MountainCar-v0')
env.reset()
for step in range(300):
    env.render()
    # step() returns (observation, reward, done, info)
    res = env.step(env.action_space.sample())
    print(step)
    print(res[2])  # the done flag

I want to run the step method until the car reaches the flag and then break out of the loop. Is this possible? Something similar to this:

n_episodes = 10
for i in range(n_episodes):
    env.reset()
    done = False  # reset the flag at the start of every episode
    while not done:
        env.render()
        state, reward, done, _ = env.step(env.action_space.sample())

Answer

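Recent versions of gym force-stop the episode after 200 steps even if you don't use env.monitor. This happens because MountainCar-v0 is registered with max_episode_steps=200, and gym.make wraps the environment in a time-limit wrapper that sets done to True once that many steps have elapsed.
To avoid this, use the underlying environment instead of the wrapped one:
env = gym.make("MountainCar-v0").env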
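To illustrate why taking .env removes the limit, here is a minimal sketch that does not use gym at all. ToyEnv, ToyTimeLimit and steps_until_done are hypothetical names made up for this example; the wrapper is a simplified stand-in for the idea of a step-limit wrapper, not gym's actual code, and it assumes the older 4-tuple return value of step() used in the question.

```python
class ToyEnv:
    """Hypothetical environment whose goal is never reached."""

    def reset(self):
        return 0

    def step(self, action):
        # (observation, reward, done, info), as in the question's code
        return 0, -1.0, False, {}


class ToyTimeLimit:
    """Simplified step-limit wrapper (illustration only, not gym's code)."""

    def __init__(self, env, max_episode_steps):
        self.env = env  # the wrapped environment, reachable via .env
        self.max_episode_steps = max_episode_steps
        self.elapsed_steps = 0

    def reset(self):
        self.elapsed_steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.elapsed_steps += 1
        # Force the episode to end once the step budget is used up.
        if self.elapsed_steps >= self.max_episode_steps:
            done = True
        return obs, reward, done, info


def steps_until_done(env, cap=1000):
    """Return the step count at which done became True, or None if it never did within cap steps."""
    env.reset()
    for t in range(1, cap + 1):
        _, _, done, _ = env.step(0)
        if done:
            return t
    return None


wrapped = ToyTimeLimit(ToyEnv(), max_episode_steps=200)
print(steps_until_done(wrapped))      # 200: the wrapper ends the episode
print(steps_until_done(wrapped.env))  # None: the inner env never reports done
```

Note that without the limit, an episode driven by random actions only ends when the car actually reaches the flag, so you may want to keep your own step cap in the loop.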

Attribution
Source: Link, Question Author: needRhelp, Answer Author: Scitator
