RAFA-NET

Code used in research on estimating head pose orientation from RGB images. Code to build and train RAFA-Net is provided. The model takes an input image with a face bounding box and outputs the yaw, pitch and roll of the person's head in radians.
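Since the model reports angles in radians, a conversion step can help when comparing against results quoted in degrees. The sketch below assumes a prediction arrives as a (yaw, pitch, roll) triple. The function name `pose_to_degrees` is hypothetical and not part of this repository.

```python
import math


def pose_to_degrees(pose_rad):
    """Convert a (yaw, pitch, roll) triple from radians to degrees."""
    yaw, pitch, roll = pose_rad
    return (math.degrees(yaw), math.degrees(pitch), math.degrees(roll))


# Hypothetical values for illustration only, not an actual model output.
example = pose_to_degrees((math.pi / 6, 0.0, -math.pi / 2))
print(example)
```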

Published Results

Grad-CAM outputs of the 3 angles

Paper

Final ACCV Version

Abstract

Head pose is a vital indicator of human attention and behavior. Therefore, automatic estimation of head pose from images is key to many applications. In this paper, we propose a novel approach for head pose estimation from a single RGB image. Many existing approaches predict head poses by localizing facial landmarks and then solving a 2D-to-3D correspondence problem with a mean head model. Such approaches rely entirely on the landmark detection accuracy, an ad-hoc alignment step, and the extraneous head model. To address this drawback, we present an end-to-end deep network, which explores an innovative rotation axis (yaw, pitch and roll) focused attention mechanism to capture the subtle changes in images. The mechanism uses attentional spatial pooling from a self-attention layer, learns the importance over fine-grained to coarse spatial structures, and combines them to capture rich semantic information concerning a given rotation axis. Evaluated on three benchmark datasets, our approach is very competitive with the state of the art, including methods with and without landmarks.

Dependencies

Python Modules

The code was written in Python 3.6.5 and run on Ubuntu 18.04.4. All requirements can be installed by running the following command:

pip install -r requirements.txt

Keras 2.2.4
TensorFlow 1.13.1
OpenCV 4.2.0
SelfAttention 0.46.0
scikit-learn

Datasets

Three datasets were used:

300W-LP & AFLW2000 - http://www.cbsr.ia.ac.cn/users/xiangyuzhu/projects/3DDFA/main.htm
BIWI - https://www.kaggle.com/kmader/biwi-kinect-head-pose-database

Bounding box information for all datasets can be found at: https://github.com/MingzhenShao/HeadPose
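Bounding boxes from annotation files can extend past the image border, so it is usually worth clamping them before cropping the face. The sketch below is one way to do that. It assumes boxes are given as (x_min, y_min, x_max, y_max) in pixels, which may not match the layout used in the linked annotations. The helper name `clamp_box` is hypothetical.

```python
def clamp_box(box, width, height):
    """Clamp an (x_min, y_min, x_max, y_max) box to the image bounds.

    The box layout is an assumption; check the annotation files for the
    actual format before relying on it.
    """
    x_min, y_min, x_max, y_max = (int(round(v)) for v in box)
    x_min, x_max = max(0, x_min), min(width, x_max)
    y_min, y_max = max(0, y_min), min(height, y_max)
    if x_min >= x_max or y_min >= y_max:
        raise ValueError("bounding box does not overlap the image")
    return x_min, y_min, x_max, y_max


# A box that spills over the top and right edges of a 640x480 image.
box = clamp_box((600.4, -20, 700, 100.2), width=640, height=480)
print(box)  # (600, 0, 640, 100)
```

With an image loaded as an array of shape (height, width, channels), the face crop would then be `image[y_min:y_max, x_min:x_max]`.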

Running the Code

The model can be created by running:

python train_rafa-net.py

By default, the model trains on 300W-LP and tests on AFLW2000 (see lines 350-354 of train_rafa-net.py).
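This README does not state how test predictions are scored. Head pose benchmarks are commonly reported as per-axis mean absolute error (MAE) in degrees, so the sketch below assumes that metric and assumes predictions and labels are lists of (yaw, pitch, roll) triples in radians. The function name is hypothetical and the code is not taken from train_rafa-net.py.

```python
import math


def mean_absolute_error_degrees(pred_rad, true_rad):
    """Per-axis MAE in degrees for lists of (yaw, pitch, roll) triples in radians.

    Angle wrap-around is not handled; this assumes errors stay well within
    half a turn.
    """
    if len(pred_rad) != len(true_rad) or not pred_rad:
        raise ValueError("inputs must be non-empty and of equal length")
    totals = [0.0, 0.0, 0.0]
    for pred, true in zip(pred_rad, true_rad):
        for axis in range(3):
            totals[axis] += abs(math.degrees(pred[axis] - true[axis]))
    count = len(pred_rad)
    return tuple(total / count for total in totals)


# Hypothetical predictions and labels for illustration only.
preds = [(0.10, 0.00, 0.05), (0.30, -0.10, 0.00)]
labels = [(0.00, 0.00, 0.05), (0.10, 0.10, 0.00)]
yaw_mae, pitch_mae, roll_mae = mean_absolute_error_degrees(preds, labels)
print(yaw_mae, pitch_mae, roll_mae)
```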

Citation

Behera, A., Wharton, Z., Hewage, P., Kumar, S., 2020. Rotation Axis Focused Attention Network (RAFA-Net) for Estimating Head Pose. In: Asian Conference on Computer Vision 2020, 30 Nov-4 Dec 2020.

