When using the MountainCar-v0 environment from OpenAI Gym in Python, the value done becomes True after 200 time steps. Why is that? Since the goal state hasn't been reached, the episode shouldn't be done.
    import gym

    env = gym.make('MountainCar-v0')
    env.reset()
    for _ in range(300):
        env.render()
        res = env.step(env.action_space.sample())
        print(_)
        print(res[2])
I want to run the step method until the car reaches the flag and then break out of the loop. Is this possible? Something similar to this:
    n_episodes = 10
    for i in range(n_episodes):
        env.reset()
        done = False  # reset the flag at the start of every episode
        while not done:
            env.render()
            state, reward, done, _ = env.step(env.action_space.sample())
Answer
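The current version of gym force-stops the environment after 200 steps, even if you don't use env.monitor. The limit comes from the TimeLimit wrapper that gym.make applies to MountainCar-v0, which is registered with max_episode_steps=200, so done becomes True once the step budget runs out and not only when the car reaches the flag.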
To avoid this, use
env = gym.make("MountainCar-v0").env
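To make the mechanism concrete, here is a minimal pure-Python sketch of why accessing .env removes the limit. It does not use gym at all: ToyEnv, ToyTimeLimit, run_episode and the goal_step value of 350 are hypothetical names and numbers invented for illustration, and the wrapper only mimics the general idea of a step-count cutoff, not gym's actual implementation.

```python
class ToyEnv:
    """Hypothetical stand-in for an environment that reaches its goal after goal_step steps."""

    def __init__(self, goal_step=350):
        self.goal_step = goal_step
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.goal_step  # done only when the "goal" is reached
        return self.t, -1.0, done, {}


class ToyTimeLimit:
    """Hypothetical wrapper that forces done=True after max_episode_steps steps."""

    def __init__(self, env, max_episode_steps=200):
        self.env = env  # the inner environment, reachable as wrapper.env
        self.max_episode_steps = max_episode_steps
        self.elapsed = 0

    def reset(self):
        self.elapsed = 0
        return self.env.reset()

    def step(self, action):
        state, reward, done, info = self.env.step(action)
        self.elapsed += 1
        if self.elapsed >= self.max_episode_steps:
            done = True  # cut the episode off even though the goal was not reached
        return state, reward, done, info


def run_episode(env):
    """Step the environment until done and return the number of steps taken."""
    env.reset()
    steps = 0
    done = False
    while not done:
        _, _, done, _ = env.step(0)
        steps += 1
    return steps


wrapped = ToyTimeLimit(ToyEnv())
print(run_episode(wrapped))      # 200: stopped by the wrapper
print(run_episode(wrapped.env))  # 350: inner environment runs until its goal
```

With the real MountainCar-v0 the same reasoning applies, but note that without the limit a random policy may need a very large number of steps before the car reaches the flag, so adding your own step cap to the loop is a sensible precaution.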
Attribution
Source: Link, Question Author: needRhelp, Answer Author: Scitator