1. Introduction

Uncertainty Weighted Actor-Critic (UWAC) [1] is an offline reinforcement learning algorithm that detects out-of-distribution (OOD) state-action pairs and down-weights their contribution to the training objectives accordingly. In offline reinforcement learning, whether a sample is OOD can be judged by estimating its epistemic uncertainty. Using the estimated uncertainty as a regularization term when training the critic penalizes the Q-values of OOD data directly, but such an approach limits the generalization ability of the Q-function. UWAC, which builds on the BEAR algorithm, therefore uses the estimated uncertainty as per-sample weights instead, reducing the influence of updates from OOD samples. Concretely, UWAC estimates the uncertainty with MC-dropout and trains the actor and critic with the weighted losses shown below.
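A sketch of the uncertainty-weighted objectives, following the formulation in [1] (here $\beta$ is a weighting hyperparameter, $y$ is the Bellman target, and $\mathrm{Var}[\cdot]$ is the MC-dropout variance estimate; the exact normalization may differ in the code):

$$\mathcal{L}_{\text{critic}}(\theta) = \mathbb{E}_{(s,a,r,s')\sim\mathcal{D}}\!\left[\frac{\beta}{\mathrm{Var}\big[Q_{\theta'}(s',a')\big]}\,\big(Q_\theta(s,a) - y\big)^2\right], \qquad y = r + \gamma\, Q_{\theta'}(s',a'),$$

$$\mathcal{L}_{\text{actor}}(\phi) = -\,\mathbb{E}_{s\sim\mathcal{D},\,a\sim\pi_\phi}\!\left[\frac{\beta}{\mathrm{Var}\big[Q_\theta(s,a)\big]}\,Q_\theta(s,a)\right].$$

Below is a minimal PyTorch sketch of the MC-dropout weighting applied to the critic update. The network layout, batch format, and weight-clipping value are illustrative assumptions, not the repository's exact implementation:

```python
import torch
import torch.nn as nn

class DropoutQNet(nn.Module):
    """Q-network with dropout layers, so variance across stochastic
    forward passes can serve as an epistemic-uncertainty estimate."""
    def __init__(self, state_dim, action_dim, hidden=256, p=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def mc_dropout_variance(q_net, state, action, n_samples=10):
    """MC-dropout: keep dropout active and take the variance of Q
    over several stochastic forward passes."""
    q_net.train()  # keep dropout enabled at "inference" time
    with torch.no_grad():
        qs = torch.stack([q_net(state, action) for _ in range(n_samples)])
    return qs.var(dim=0)

def weighted_critic_loss(q_net, target_q_net, batch, gamma=0.99, beta=1.0):
    """Down-weight the Bellman error of transitions whose target-Q
    uncertainty is high (batch layout is a hypothetical convention)."""
    s, a, r, s2, a2, done = batch  # a2: next action from the (BEAR-style) actor
    with torch.no_grad():
        target = r + gamma * (1.0 - done) * target_q_net(s2, a2)
        var = mc_dropout_variance(target_q_net, s2, a2)
        w = (beta / var).clamp(max=1.5)  # clip weights for stability (assumed value)
    td_error = q_net(s, a) - target
    return (w * td_error.pow(2)).mean()
```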

2. Instructions

Train UWAC on a D4RL dataset, for example:

```
python uwac-train.py --dataset=walker2d-random-v2 --seed=0 --gpu=0
```

3. Performance

4. References

  1. Wu, Yue, et al. "Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning." International Conference on Machine Learning. PMLR, 2021.