The official source code of The Visual Computer paper (CGI 2024), "Joint Self-Attention for Denoising Monte Carlo Rendering".

Geunwoo Oh, Bochang Moon

[Teaser figure]

For more information, please refer to our project page or other resources below:


Abstract

Image-space denoising of rendered images has become a commonly adopted approach since this post-rendering process often drastically reduces required sample counts (thus rendering times) for producing a visually pleasing image without noticeable noise. It is a common practice to conduct such denoising while preserving image details by exploiting auxiliary information (e.g., G-buffers) as well as input colors. However, it is still challenging to devise an ideal denoising framework that fully considers the two inputs with different characteristics, e.g., noisy but complete shading information in the input colors and less noisy but partial shading information in the auxiliary buffers. This paper proposes a transformer-based denoising framework with a new self-attention mechanism that infers a joint self-attention map, i.e., self-similarity in input features, through dual attention scores: one from noisy colors and another from auxiliary buffers. We demonstrate that this separate consideration of the two inputs allows our framework to produce more accurate denoising results than state-of-the-art denoisers for various test scenes.


Network Architecture

An overview of our transformer block. Our proposed transformer takes two input feature maps, a color feature map $X$ and an auxiliary feature map $F$, and extracts dual self-attention scores, $S_X$ from $X$ and $S_F$ from $F$, to compute the output feature map $\hat{X}$ using the joint attention score $S_{X \cap F}=S_X \circ S_F$. We then apply a linear projection to the output feature map and pass it to the feed-forward network.
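
To make the joint attention score concrete, the following is a minimal PyTorch sketch of the idea, not the implementation released in this repository: it computes two attention-score maps, $S_X$ from the color features and $S_F$ from the auxiliary features, multiplies them element-wise, and applies the result to values taken from the color branch. The module name, the softmax scoring, the value projection, and the renormalization of the joint score are illustrative assumptions.

    # Minimal sketch of a joint self-attention block (illustrative only; see the
    # released code for the actual architecture). Shapes: (batch, tokens, dim).
    import torch
    import torch.nn as nn

    class JointSelfAttentionSketch(nn.Module):
        def __init__(self, dim: int):
            super().__init__()
            self.scale = dim ** -0.5
            # separate query/key projections for the color and auxiliary branches
            self.qk_color = nn.Linear(dim, 2 * dim, bias=False)
            self.qk_aux = nn.Linear(dim, 2 * dim, bias=False)
            # values come from the (noisy) color features in this sketch
            self.v_color = nn.Linear(dim, dim, bias=False)
            self.proj = nn.Linear(dim, dim)  # linear projection after attention

        def forward(self, x: torch.Tensor, f: torch.Tensor) -> torch.Tensor:
            q_x, k_x = self.qk_color(x).chunk(2, dim=-1)
            q_f, k_f = self.qk_aux(f).chunk(2, dim=-1)
            v = self.v_color(x)

            # dual attention scores: S_X from the colors, S_F from the aux buffers
            s_x = torch.softmax(q_x @ k_x.transpose(-2, -1) * self.scale, dim=-1)
            s_f = torch.softmax(q_f @ k_f.transpose(-2, -1) * self.scale, dim=-1)

            # joint score S_{X ∩ F} = S_X ∘ S_F (element-wise product);
            # the renormalization below is an assumption of this sketch
            s_joint = s_x * s_f
            s_joint = s_joint / (s_joint.sum(dim=-1, keepdim=True) + 1e-8)

            # \hat{X}, passed on to the feed-forward network
            return self.proj(s_joint @ v)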

Setup

1. Environments (tested)

  • Ubuntu 18.04
  • PyTorch 1.8.1
  • GPU: NVIDIA RTX 3090
    • driver version >= 525

Requirements

Please install docker and nvidia-docker.

2. Building docker

  1. In a terminal, go to the directory containing the Dockerfile and run the following command to build a docker image:

    docker build -t joint_sa .
    
  2. After building a docker image, create a container:

    nvidia-docker run -v ${PWD}/data:/data -v ${PWD}/codes:/codes -it joint_sa;
    

    Or, simply run run_docker.sh.

    ./run_docker.sh
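
    Once inside the container, you can optionally confirm that PyTorch can see the GPU. This is a generic check, not a script included in this repository:

    # Generic sanity check (not part of this repository): verify that PyTorch
    # is installed and that the NVIDIA GPU is visible from inside the container.
    import torch

    print("PyTorch:", torch.__version__)              # e.g., 1.8.1
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:", torch.cuda.get_device_name(0))  # e.g., NVIDIA GeForce RTX 3090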
    

Test with a pre-trained network

  1. Set up the configurations in config.py. Please move the example data (download) to 'data/__test_scenes__/' and the pretrained checkpoint (download) to 'data/jsa_pretrained/__checkpoints__/'.

    # =======================================
    # Dataset
    # =======================================
    config["testDatasetDirectory"] = "../data/__test_scenes__/example"
    
    config["test_input"] = 'input' 
    config["test_target"] = 'target'
    # =======================================
    # Task
    # =======================================
    config["task"] = 'jsa_pretrained'
    
    # =======================================
    # Test parameter
    # =======================================
    config["load_epoch"] = "5800" #'jsa_pretrained'
    
    
  2. Run the following command in the container's terminal (the bash terminal from step 2.2 of Setup).

    • The result images will be saved in /data/jsa_pretrained/output_test.
    python test.py
    

Retraining

  1. Prepare a dataset

    • We generated a training dataset that contains input/reference image pairs with auxiliary buffers (albedo, normal, and depth).
    • We used the Tungsten renderer, with 32 spp for the input images and 32K spp for the reference images.
    • The training dataset was generated from multiple scenes, and we randomly split it into training and validation sets.
    • After generating the dataset, please move the training dataset into the training directory (../data/__train_scenes__/). For example, move the input data into the input folder (e.g., ../data/__train_scenes__/example/input) and the reference data into the target folder (e.g., ../data/__train_scenes__/example/target); a sketch of the resulting layout follows below.
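
    As a concrete illustration (not a script shipped with this repository), the following minimal Python snippet creates the expected layout; the folder name "example" is the placeholder used throughout this README.

    # Illustration only: prepare the expected training-data layout.
    import os

    root = "../data/__train_scenes__/example"
    for sub in ("input", "target"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)

    # After copying your data, the tree should look like:
    #   ../data/__train_scenes__/example/input/   <- 32 spp inputs + auxiliary buffers
    #   ../data/__train_scenes__/example/target/  <- 32K spp reference images
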
  2. Set up the configurations in config.py.

    # for test using pre-trained checkpoint
    # config["multi_gpu"]  = False  # Please comment out this line, since we are not using a pre-trained net.
    
    # =======================================
    # Dataset
    # =======================================
    config["DataDirectory"] = "../data"
    config["trainDatasetDirectory"] = "../data/__train_scenes__/example" # folder name to contain training dataset
    
    config["train_input"] = 'input'
    config["train_target"] = 'target'
    
    # =======================================
    # Task
    # =======================================
    config["task"] = 'jsa_trained' # folder name to contain training results
    
    
  3. Run the following command in the container's terminal (the bash terminal from step 2.2 of Setup).

    • After training, the result data will be saved in /data/config['task'].
    python main_train.py
    

License

All source code is released under a BSD License.

Citation

@article{JointSA2024, 
    title={Joint self-attention for denoising Monte Carlo rendering}, 
    author={Oh, Geunwoo and Moon, Bochang}, 
    journal={The Visual Computer}, 
    year={2024}, 
    month={Jun}, 
    issn={1432-2315},
    url={https://doi.org/10.1007/s00371-024-03446-8}, 
    DOI={10.1007/s00371-024-03446-8}, 
    abstractNote={Image-space denoising of rendered images has become a commonly adopted approach since this post-rendering process often drastically reduces required sample counts (thus rendering times) for producing a visually pleasing image without noticeable noise. It is a common practice to conduct such denoising while preserving image details by exploiting auxiliary information (e.g., G-buffers) as well as input colors. However, it is still challenging to devise an ideal denoising framework that fully considers the two inputs with different characteristics, e.g., noisy but complete shading information in the input colors and less noisy but partial shading information in the auxiliary buffers. This paper proposes a transformer-based denoising framework with a new self-attention mechanism that infers a joint self-attention map, i.e., self-similarity in input features, through dual attention scores: one from noisy colors and another from auxiliary buffers. We demonstrate that this separate consideration of the two inputs allows our framework to produce more accurate denoising results than state-of-the-art denoisers for various test scenes.}
}

Contact

If you have any questions, issues, or comments, please feel free to contact gnuo8325@gm.gist.ac.kr.
