daniellawson9999/data-tests

Experiments with Minari, importing older D4RL datasets (tested with Python 3.10.11).

Port D4RL MuJoCo to Minari

Download D4RL and Minari:

cd d4rl_mujoco_minari

Activate D4RL environment and run:

python mujoco_d4rl_to_pkl.py --dir={save_dir}

where save_dir is the directory to store D4RL .pkl files.

Activate Minari environment and run:

python mujoco_pkl_to_minari.py --dir={save_dir}

where author, author_email, and code_permalink can optionally be passed.
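For example, assuming the optional metadata is passed as command-line flags (the exact flag names and values here are illustrative; check the script's --help):

```shell
python mujoco_pkl_to_minari.py --dir={save_dir} \
    --author="Jane Doe" \
    --author_email="jane@example.com" \
    --code_permalink="https://github.com/daniellawson9999/data-tests"
```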

Test loading new environments:

import minari

dataset = minari.load_dataset('d4rl_halfcheetah-expert-v2')

env = dataset.recover_environment() # Deprecated HalfCheetah-v3 environment

# Sample an episode
episode = dataset.sample_episodes(n_episodes=1)[0]

Port Atari dqn-replay to Minari

New: ports 1% of the 1st seed of dqn-replay to Minari.

Install Minari:

  • pip install minari==0.4.1

Run:

python download_convert.py --convert

The dataset name follows the convention {game}-top1-s{index}-v0. The seed index can be set by passing --index, which defaults to 1, matching the seed used by work such as Scaled QL. Below is an example of loading a dataset; Breakout can be replaced with any game listed in ./atari_minari/atari_games.py.

import minari
from atari_minari.utils import create_atari_env

dataset = minari.load_dataset('Breakout-top1-s1-v0')

base_env = dataset.recover_environment() # Recommended to instead build the environment explicitly, as follows:
env = create_atari_env('ALE/Breakout-v5', repeat_action_probability=0.25, clip_rewards=False)
# Disable sticky actions (repeat_action_probability) for evaluation:
env = create_atari_env('ALE/Breakout-v5', repeat_action_probability=0.0, clip_rewards=False)

# Sample an episode
episode = dataset.sample_episodes(n_episodes=1)[0]
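The {game}-top1-s{index}-v0 naming convention above can be sketched as a small helper (hypothetical, for illustration only; it is not part of this repository):

```python
def dataset_id(game: str, index: int = 1) -> str:
    """Build a dataset name following the {game}-top1-s{index}-v0 convention."""
    return f"{game}-top1-s{index}-v0"

print(dataset_id("Breakout"))       # Breakout-top1-s1-v0
print(dataset_id("Pong", index=3))  # Pong-top1-s3-v0
```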

(OLD) Port Atari to Minari

OLD procedure

Download d4rl-atari and Minari:

Activate Atari environment and run:

python atari_to_pkl.py --dir={save_dir}

where save_dir is the directory to store D4RL .npz files.

To convert the datasets, activate the Minari environment and run:

python atari_pkl_to_minari.py --dir={save_dir}

This will create dataset(s) named {env_name}-{dataset_type}_s{seed}-v0, where env_name is the name of the environment, e.g. Breakout. seed and dataset_type follow https://github.com/takuseno/d4rl-atari; we test with expert, which contains the last 1M steps of training. _s{seed} specifies which trained agent to use; this is referred to as -v in the d4rl-atari repository but is renamed to _s here, since -v specifies the dataset version in Minari.

Example of loading a dataset:

import minari
from atari_minari.utils import create_atari_env

dataset = minari.load_dataset('Breakout-expert_s0-v0')

base_env = dataset.recover_environment() # Recommended to instead build the environment explicitly, as follows:
env = create_atari_env('ALE/Breakout-v5', repeat_action_probability=0.25, clip_rewards=True)
# Disable sticky actions (repeat_action_probability) for evaluation:
env = create_atari_env('ALE/Breakout-v5', repeat_action_probability=0.0, clip_rewards=True)

# Sample an episode
episode = dataset.sample_episodes(n_episodes=1)[0]

There are several things to note:

  • dataset.recover_environment() will return the environment without reward clipping, due to issues serializing TransformReward(). To load the environment with reward clipping, recreate it with create_atari_env() and pass clip_rewards=True.
  • While the dataset from An Optimistic Perspective on Offline Reinforcement Learning was collected with repeat_action_probability=0.25, two recent papers that aim to create generalist Atari agents, Multi-Game Decision Transformers and Scaled QL, train on this dataset but set repeat_action_probability=0.0 during evaluation.
  • Both the dataset and the environment return unscaled 84x84 observations with values ranging from 0 to 255. Normalize these values before network input, e.g. by dividing by 255 to scale them to [0, 1], or use another normalization scheme.
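The divide-by-255 scheme mentioned in the last point can be sketched with NumPy (a sketch using a randomly generated stand-in frame, not code from this repository):

```python
import numpy as np

# Fake 84x84 uint8 observation standing in for a dataset/environment frame
obs = np.random.randint(0, 256, size=(84, 84), dtype=np.uint8)

# Scale pixel values from [0, 255] to [0.0, 1.0] before feeding a network
normalized = obs.astype(np.float32) / 255.0

assert normalized.dtype == np.float32
assert normalized.min() >= 0.0 and normalized.max() <= 1.0
```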
