Code for the paper Learning Latent Dynamic Robust Representations for World Models (ICML-24).
We present a new framework for learning state representations and dynamics in the presence of exogenous noise. We introduce a masking strategy and latent reconstruction to eliminate redundant spatio-temporal information, and employ the bisimulation principle to capture task-relevant information. To address co-training instabilities, we further develop a hybrid RSSM (HRSSM) structure.
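The training code couples these objectives inside the world model; as a rough orientation only, below is a minimal, self-contained sketch (not the repository's implementation) of the three ideas: patch masking of observations, latent reconstruction against a target encoder, and a bisimulation-style distance loss. All function names, shapes, and hyper-parameters are illustrative assumptions.

```python
# Illustrative sketch only -- not the repository code. Module names, shapes,
# and hyper-parameters are assumptions made for the example.
import torch
import torch.nn.functional as F

def mask_patches(obs, patch=8, mask_ratio=0.5):
    """Randomly zero out square patches of an image batch (B, C, H, W)."""
    B, C, H, W = obs.shape
    gh, gw = H // patch, W // patch
    keep = (torch.rand(B, 1, gh, gw, device=obs.device) > mask_ratio).float()
    keep = F.interpolate(keep, size=(H, W), mode="nearest")
    return obs * keep

def latent_reconstruction_loss(online_encoder, target_encoder, obs):
    """Predict the latent of the clean observation from the masked one."""
    z_pred = online_encoder(mask_patches(obs))
    with torch.no_grad():  # the target branch is not updated by this loss
        z_target = target_encoder(obs)
    return F.mse_loss(z_pred, z_target)

def bisimulation_loss(z, reward, z_next, gamma=0.99):
    """Encourage latent distances to match reward plus discounted next-state distances."""
    perm = torch.randperm(z.size(0), device=z.device)  # compare random pairs in the batch
    z_dist = torch.norm(z - z[perm], dim=-1)
    r_dist = (reward - reward[perm]).abs().squeeze(-1)
    next_dist = torch.norm(z_next - z_next[perm], dim=-1).detach()
    target = r_dist + gamma * next_dist
    return F.mse_loss(z_dist, target)
```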
You can install the dependencies with the following command:
bash setup/install_env.sh
To train the models from the paper:
For the standard DeepMind Control (DMC) tasks, run the following command:
python -u dreamer.py --configs dmc_vision --task dmc_walker_stand --seed 0 --logdir ./log
For the Distracted DMC tasks with natural video backgrounds, download the videos labeled 'driving_car' from the Kinetics 400 dataset and run the following command:
python -u dreamer.py --configs dmc_vision --task dmc_walker_stand_video --seed 0 --logdir ./log
For the DMC-GS tasks, run the following command, where {mode} is one of {color_easy, color_hard, video_easy, video_hard, sensor_cs, distracting_cs}:
python -u dreamer.py --configs dmc_vision --task dmc_walker_stand_{mode} --seed 0 --logdir ./log
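For example, for the video_hard mode:
python -u dreamer.py --configs dmc_vision --task dmc_walker_stand_video_hard --seed 0 --logdir ./log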
For the Realistic ManiSkill tasks, download the background assets from this link and run the following command:
python -u dreamer.py --configs realistic_maniskill --task rms_turn_faucet --seed 0 --logdir ./log
- Our code is based on dreamerv3-torch.
- The Distracted DeepMind Control Suite environment is adapted from DBC.
- The Realistic ManiSkill environment is adapted from RePo.
- The DMC-GS environment is adapted from the DMControl Generalization Benchmark.
Please cite the paper Learning Latent Dynamic Robust Representations for World Models if you find the resources in this repository useful.