[ICML 2023 Oral] Official environments and implementations for "Subequivariant Graph Reinforcement Learning in 3D Environments"

alpc91/SGRL

Subequivariant Graph Reinforcement Learning in 3D Environments

ICML 2023 Oral

Runfa Chen¹, Jiaqi Han¹, Fuchun Sun¹,², Wenbing Huang³,⁴

¹ Department of Computer Science and Technology, Institute for AI, BNRist Center, Tsinghua University
² THU-Bosch JCML Center
³ Gaoling School of Artificial Intelligence, Renmin University of China
⁴ Beijing Key Laboratory of Big Data Management and Analysis Methods

This is a PyTorch implementation of our Subequivariant Graph Reinforcement Learning. In this work, we introduce 3D-SGRL, a new morphology-agnostic RL benchmark that extends the widely adopted 2D-planar setting to 3D, giving agents a significantly larger exploration space with arbitrary initial locations and target directions. To learn a policy in this massive search space, we design SET, a novel model that preserves geometric symmetry by construction. Experimental results strongly support the necessity of encoding symmetry into the policy network and show its wide applicability to learning to navigate in various 3D environments.

If you find this work useful in your research, please cite using the following BibTeX:

@inproceedings{chen2023sgrl,
    title = {Subequivariant Graph Reinforcement Learning in 3D Environments},
    author = {Chen, Runfa and Han, Jiaqi and Sun, Fuchun and Huang, Wenbing},
    booktitle = {International Conference on Machine Learning},
    year = {2023},
    organization = {PMLR}
}

Setup

Requirements

Installing Dependencies

pip install --upgrade pip
pip install -r requirements.txt

Running Code

Flags and Parameters Description
--env_name <STRING> Name of the experiment project folder; also used as the project name in wandb
--morphologies <STRING> Keywords used to select existing environments for training (e.g., walker, hopper, humanoid, cheetah, whh, cwhh)
--expID <STRING> Experiment name, used to create the saving directory
--exp_path <STRING> Directory where the experimental results are saved
--config_path <STRING> Path to the configuration file
--gpu <INT> GPU device ID (e.g., 0, 1, 2, 3)
--custom_xml <PATH> Path to a custom xml file for training the morphology-agnostic policy.
    When <PATH> is a file, train on that xml morphology only.
    When <PATH> is a directory, train on all xml morphologies found in that directory.
--actor_type <STRING> Type of actor to use (e.g., smp, swat, set, mlp)
--critic_type <STRING> Type of critic to use (e.g., smp, swat, set, mlp)
--seed <INT> (Optional) Seed for Gym, PyTorch and NumPy
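
To make the flag semantics concrete, here is a minimal sketch of how the documented flags could be declared with argparse, and how a `--custom_xml` path could be expanded into a list of morphology files. It is not the repository's actual parser: `build_parser`, `resolve_custom_xml`, every default value, and the choice to accept several `--morphologies` keywords at once are assumptions made for illustration.

```python
import argparse
from pathlib import Path
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (defaults are assumed)."""
    p = argparse.ArgumentParser(description="Sketch of the documented flags")
    p.add_argument("--env_name", type=str, required=True)
    # Assumption: several keywords may be passed, e.g. --morphologies walker hopper
    p.add_argument("--morphologies", type=str, nargs="*", default=[])
    p.add_argument("--expID", type=str, required=True)
    p.add_argument("--exp_path", type=str, default="results")
    p.add_argument("--config_path", type=str, default=None)
    p.add_argument("--gpu", type=int, default=0)
    p.add_argument("--custom_xml", type=str, default=None)
    p.add_argument("--actor_type", type=str, default="set")
    p.add_argument("--critic_type", type=str, default="set")
    p.add_argument("--seed", type=int, default=None)  # optional, as documented
    return p


def resolve_custom_xml(path: str) -> List[Path]:
    """Return one xml file if `path` is a file, or every *.xml in a directory."""
    target = Path(path)
    if target.is_file():
        return [target]
    if target.is_dir():
        # Sorted so the morphology order is reproducible across runs.
        return sorted(target.glob("*.xml"))
    raise FileNotFoundError(f"--custom_xml path does not exist: {path}")
```

With this sketch, `build_parser().parse_args(["--env_name", "demo", "--expID", "run0"])` yields a namespace whose `actor_type` and `critic_type` both fall back to the assumed default `set`.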

Train with existing environments

  • Train SET on 3D_Hopper++ (3 variants of hopper):
cd src/
bash start.sh

3D-SGRL Environments

3D Hopper

3d_hopper_3_shin

3d_hopper_4_lower_shin

3d_hopper_5_full

3D Walker

3d_walker_2_right_leg_left_knee

3d_walker_3_left_leg_right_foot

3d_walker_4_right_knee_left_foot

3d_walker_5_foot

3d_walker_5_left_knee

3d_walker_7_full

3d_walker_3_left_knee_right_knee

3d_walker_6_right_foot

3D Humanoid

3d_humanoid_7_left_arm

3d_humanoid_7_lower_arms

3d_humanoid_7_right_arm

3d_humanoid_7_right_leg

3d_humanoid_8_left_knee

3d_humanoid_9_full

3d_humanoid_7_left_leg

3d_humanoid_8_right_knee

3D Cheetah

3d_cheetah_10_tail_leftbleg

3d_cheetah_11_leftfleg

3d_cheetah_11_tail_rightfknee

3d_cheetah_12_rightbknee

3d_cheetah_12_tail_leftbfoot

3d_cheetah_13_rightffoot

3d_cheetah_13_tail

3d_cheetah_14_full

3d_cheetah_11_leftbkneen_rightffoot

3d_cheetah_12_tail_leftffoot

For the results reported in the paper, the following agents are in the held-out set for the corresponding experiments:

  • 3D_Walker++: 3d_walker_3_left_knee_right_knee, 3d_walker_6_right_foot
  • 3D_Humanoid++: 3d_humanoid_7_left_leg, 3d_humanoid_8_right_knee
  • 3D_Cheetah++: 3d_cheetah_11_leftbkneen_rightffoot, 3d_cheetah_12_tail_leftffoot

All other agents in the corresponding experiments are used for training.
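
The split described above can be written out as a small sketch. The agent names are copied from the environment lists in this README; the dictionary layout and the helper `split_agents` are hypothetical and not part of the repository.

```python
from typing import Dict, List, Tuple

# Agent names as listed in this README, grouped by experiment.
ALL_AGENTS: Dict[str, List[str]] = {
    "3D_Walker++": [
        "3d_walker_2_right_leg_left_knee", "3d_walker_3_left_leg_right_foot",
        "3d_walker_4_right_knee_left_foot", "3d_walker_5_foot",
        "3d_walker_5_left_knee", "3d_walker_7_full",
        "3d_walker_3_left_knee_right_knee", "3d_walker_6_right_foot",
    ],
    "3D_Humanoid++": [
        "3d_humanoid_7_left_arm", "3d_humanoid_7_lower_arms",
        "3d_humanoid_7_right_arm", "3d_humanoid_7_right_leg",
        "3d_humanoid_8_left_knee", "3d_humanoid_9_full",
        "3d_humanoid_7_left_leg", "3d_humanoid_8_right_knee",
    ],
    "3D_Cheetah++": [
        "3d_cheetah_10_tail_leftbleg", "3d_cheetah_11_leftfleg",
        "3d_cheetah_11_tail_rightfknee", "3d_cheetah_12_rightbknee",
        "3d_cheetah_12_tail_leftbfoot", "3d_cheetah_13_rightffoot",
        "3d_cheetah_13_tail", "3d_cheetah_14_full",
        "3d_cheetah_11_leftbkneen_rightffoot", "3d_cheetah_12_tail_leftffoot",
    ],
}

# Held-out agents for the results reported in the paper.
HELD_OUT: Dict[str, List[str]] = {
    "3D_Walker++": ["3d_walker_3_left_knee_right_knee", "3d_walker_6_right_foot"],
    "3D_Humanoid++": ["3d_humanoid_7_left_leg", "3d_humanoid_8_right_knee"],
    "3D_Cheetah++": ["3d_cheetah_11_leftbkneen_rightffoot", "3d_cheetah_12_tail_leftffoot"],
}


def split_agents(experiment: str) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: return (training agents, held-out agents)."""
    held = set(HELD_OUT[experiment])
    train = [a for a in ALL_AGENTS[experiment] if a not in held]
    test = [a for a in ALL_AGENTS[experiment] if a in held]
    return train, test
```

Under this listing, 3D_Walker++ and 3D_Humanoid++ each train on six agents and 3D_Cheetah++ on eight, with two agents held out per experiment.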

Acknowledgement

The RL code is based on this open-source implementation, and the morphology-agnostic implementation is built on top of the SMP (Huang et al., ICML 2020), Amorpheus (Kurin et al., ICLR 2021) and SWAT (Hong et al., ICLR 2022) repositories.