Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] reward_lose not correctly set at timeout #103

Open
epignatelli opened this issue May 9, 2024 · 1 comment
Open

[BUG] reward_lose not correctly set at timeout #103

epignatelli opened this issue May 9, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@epignatelli
Copy link

epignatelli commented May 9, 2024

🐛 Bug

When setting max_episode_steps, while gym correctly triggers a reset, MiniHack (and perhaps even NLE) does not set StepStatus.ABORTED.

As a consequence, the reward for Timeout is not reward_lose.

To Reproduce

import gym
import gym.vector
import minihack

MAX_STEPS = 20

env = gym.make(
    "MiniHack-KeyRoom-S5-v0",
    reward_lose=-1.0,
    max_episode_steps=MAX_STEPS,
)
timestep = env.reset()
env.render()


for i in range(MAX_STEPS):
    timestep = env.step(3) 
    reward = timestep[1]
    info = timestep[-1]
    print(reward, info["end_status"])

assert int(info["end_status"]) == -1

Expected behavior

int(info["end_status"]) should be -1

Potential reasons

gym.make accepts max_episode_steps as argument.
For this reason, max_episode_steps is not included in kwargs and it is not passed through to the MiniHack constructor.

@epignatelli epignatelli added the bug Something isn't working label May 9, 2024
@epignatelli epignatelli changed the title [BUG] `max+ [BUG] max_episode_steps obscured by gym.make does not trigger ABORTED status May 9, 2024
@epignatelli
Copy link
Author

epignatelli commented May 9, 2024

You can temporarily patch it with this

class PatchTimeoutWrapper(gym.Wrapper):
    def __init__(self, env: gym.Env):
        super().__init__(env)
        self.unwrapped._max_episode_steps = self._max_episode_steps

And use:

env = gym.make(
    "MiniHack-KeyRoom-S5-v0",
    reward_lose=-1.0,
    max_episode_steps=MAX_STEPS,
)
env = PatchTimeoutWrapper(env)

@epignatelli epignatelli changed the title [BUG] max_episode_steps obscured by gym.make does not trigger ABORTED status [BUG] reward_lose not set at timeout May 17, 2024
@epignatelli epignatelli changed the title [BUG] reward_lose not set at timeout [BUG] reward_lose not correctly set at timeout May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant