Reproduce SAC with PARL

This example reproduces the Soft Actor-Critic (SAC) deep reinforcement learning algorithm with PARL and reaches the same level of performance as reported in the paper on the Mujoco benchmarks.

Paper: SAC in Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Mujoco games introduction

PARL currently supports the open-source version of Mujoco provided by DeepMind, so users do not need to download the Mujoco binaries, install mujoco-py, or obtain a license. For more details, please visit the Mujoco project page.

Benchmark result

Each experiment was run three times with different seeds.
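
As a minimal sketch of how runs with several seeds might be summarized, the snippet below averages per-seed evaluation returns and reports their spread. The function name `summarize_seeds` and the numbers are placeholders chosen for illustration; they are not part of PARL and are not measured results.

```python
from statistics import mean, stdev


def summarize_seeds(returns_by_seed):
    """Return (mean, sample standard deviation) of per-seed returns."""
    values = list(returns_by_seed.values())
    if len(values) < 2:
        raise ValueError("need at least two seeds to estimate the spread")
    return mean(values), stdev(values)


# Placeholder numbers, NOT benchmark results: seed -> final evaluation return.
example_returns = {0: 100.0, 1: 110.0, 2: 120.0}
avg, spread = summarize_seeds(example_returns)
print(f"mean return over {len(example_returns)} seeds: {avg:.1f} +/- {spread:.1f}")
```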

How to use

Dependencies:

python3.7+
parl>=2.1.1
paddlepaddle>=2.0.0
gym>=0.26.0
mujoco>=2.2.2
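
Before training, it can help to confirm that the installed packages meet these minimums. The following is a rough sketch using only the standard library; the helper names (`parse_version`, `meets_minimum`, `check`) are hypothetical, the version parsing is deliberately simplistic, and the `MINIMUMS` table covers only part of the list above (add the Mujoco binding as needed).

```python
import sys

try:
    from importlib.metadata import PackageNotFoundError, version
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '2.1.1' into (2, 1, 1); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to equal length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check(minimums):
    """Return {package: status string} for each requirement."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = f"{installed} ({'ok' if ok else 'needs >= ' + minimum})"
    return report


# Minimums copied from the dependency list; extend with the Mujoco binding.
MINIMUMS = {"parl": "2.1.1", "paddlepaddle": "2.0.0", "gym": "0.26.0"}

print("python ok:", sys.version_info >= (3, 7))
for package, status in check(MINIMUMS).items():
    print(f"{package}: {status}")
```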

Start Training:

Train

# To train on HalfCheetah-v4 (default), Hopper-v4, Walker2d-v4, or Ant-v4
# --alpha defaults to 0.2
python train.py --env [ENV_NAME]

# To reproduce the performance of Humanoid-v4
python train.py --env Humanoid-v4 --alpha 0.05
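
The flags used above could be wired up roughly as follows. This is a hypothetical sketch of a command-line parser, not the actual contents of `train.py`: only the flag names `--env` and `--alpha` and their defaults (`HalfCheetah-v4`, `0.2`) come from the commands above, and the function name `build_parser` is an assumption. In SAC, `alpha` is the temperature that weights the entropy term against the reward.

```python
import argparse


def build_parser():
    """Build an illustrative parser mirroring the two flags shown above."""
    parser = argparse.ArgumentParser(description="SAC training options (illustrative)")
    parser.add_argument(
        "--env", type=str, default="HalfCheetah-v4",
        help="Mujoco environment name, e.g. Hopper-v4 or Humanoid-v4")
    parser.add_argument(
        "--alpha", type=float, default=0.2,
        help="entropy temperature: weight of the entropy term against the reward")
    return parser


# Equivalent to running with no flags, then with the Humanoid settings.
default_args = build_parser().parse_args([])
humanoid_args = build_parser().parse_args(["--env", "Humanoid-v4", "--alpha", "0.05"])
print(default_args.env, default_args.alpha)
print(humanoid_args.env, humanoid_args.alpha)
```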
