BitFlippingEnv argument check and docs clarification #1698

kylesayrs · 2023-09-27T05:53:55Z

Description

Create a helper function _make_observation_space that creates the observation space and checks that discrete_obs_space and image_obs_space do not conflict
Rename self.obs_space to self._obs_space to clarify that it only manages internal state, and add a comment explaining this
Better explain what each of the observation spaces means in the documentation
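
For illustration, the conflict check described above might look like the minimal sketch below. This is not the merged implementation: the function name check_obs_space_flags, the error message, and the returned labels are hypothetical. It only assumes, as the verification script in this PR does, that setting both flags raises a ValueError.

```python
def check_obs_space_flags(discrete_obs_space: bool, image_obs_space: bool) -> str:
    """Hypothetical helper: reject conflicting flags and name the selected variant."""
    if discrete_obs_space and image_obs_space:
        # The two variants are mutually exclusive, so fail early
        raise ValueError("Cannot use both discrete and image observation spaces")
    if discrete_obs_space:
        return "discrete"
    if image_obs_space:
        return "image"
    # Neither flag is set: use the default variant
    return "default"
```

Failing in the constructor, rather than silently preferring one flag, surfaces the misconfiguration before any training starts.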

Motivation and Context

These changes make it clearer to users how the internal obs_space member is used and what each of the observation space variants means.
closes #1691

I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)

Checklist

Note: You can run most of the checks using make commit-checks.
Note: we are using a maximum length of 127 characters per line

I have checked the example in the documentation works as expected (modules/her.rst)

Given that the changes are minor, I do not think they warrant a dedicated unit test (correct me if I'm wrong). You can verify that the environment is constructed, trains, and evaluates correctly using the script below.

from stable_baselines3 import HerReplayBuffer, DDPG, DQN, SAC, TD3
from stable_baselines3.her.goal_selection_strategy import GoalSelectionStrategy
from stable_baselines3.common.envs import BitFlippingEnv


def test(model_class, discrete_obs_space, image_obs_space):
    # model_class is supplied by the caller (DQN, DDPG, SAC or TD3); do not override it here
    N_BITS = 15

    env = BitFlippingEnv(
        n_bits=N_BITS,
        continuous=model_class in [DDPG, SAC, TD3],
        discrete_obs_space=discrete_obs_space,
        image_obs_space=image_obs_space,
        max_steps=N_BITS
    )

    # Available strategies (cf paper): future, final, episode
    goal_selection_strategy = "future"  # equivalent to GoalSelectionStrategy.FUTURE

    # Initialize the model
    model = model_class(
        "MultiInputPolicy",
        env,
        replay_buffer_class=HerReplayBuffer,
        # Parameters for HER
        replay_buffer_kwargs=dict(
            n_sampled_goal=4,
            goal_selection_strategy=goal_selection_strategy,
        ),
        verbose=1,
    )

    # Train the model
    model.learn(10)

    model.save("./her_bit_env")
    # Because it needs access to `env.compute_reward()`
    # HER must be loaded with the env
    model = model_class.load("./her_bit_env", env=env)

    obs, info = env.reset()
    for _ in range(10):
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _ = env.step(action)
        if terminated or truncated:
            obs, info = env.reset()


if __name__ == "__main__":
    for discrete_obs_space in [True, False]:
        for image_obs_space in [True, False]:
            for model_class in [DQN, DDPG, SAC, TD3]:
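                # A ValueError is expected only when both flags are set
                expect_error = discrete_obs_space and image_obs_space
                try:
                    test(model_class, discrete_obs_space, image_obs_space)
                except ValueError:
                    if not expect_error:
                        print("Unexpected ValueError")
                        exit(1)
                else:
                    if expect_error:
                        print("Failed to raise error")
                        exit(1)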

    print("Success")

araffin

LGTM, thanks =)

kylesayrs · 2023-09-27T17:17:43Z

@araffin Thanks for your help!

kylesayrs added 4 commits September 26, 2023 00:28

made change, not tested yet

968b707

add back _obs_space with note on purpose

1cd84d9

match formatting

f388992

update documentation

8cb2ed3

araffin approved these changes Sep 27, 2023

araffin merged commit fab6cb3 into DLR-RM:master Sep 27, 2023
4 checks passed
