RSAdapter

The official PyTorch implementation of the paper "RSAdapter: Adapting Multimodal Models for Remote Sensing Visual Question Answering".

If you find our work useful in your research, please cite:

@article{wang2024rsadapter,
  title={RSAdapter: Adapting Multimodal Models for Remote Sensing Visual Question Answering},
  author={Wang, Yuduo and Ghamisi, Pedram},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  year={2024},
  publisher={IEEE}
}

Introduction

In this work, we introduce RSAdapter, an adaptation method that prioritizes runtime and parameter efficiency. RSAdapter comprises two key components: the Parallel Adapter and an additional linear transformation layer inserted after each fully connected (FC) layer within the Adapter. This design improves adaptation of pretrained multimodal models, and because the parameters of each linear transformation layer can be merged into the preceding FC layer at inference time, it also reduces inference cost.
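The merge described above can be illustrated with a minimal sketch. It is not the repository's implementation: it assumes the linear transformation is a per-channel scale and shift, and the names ScaleShift, ParallelAdapterSketch, and merge_linear are hypothetical. Under that assumption, gamma * (W x + b) + beta equals (gamma * W) x + (gamma * b + beta), so the extra layer can be folded into the FC layer before it.

```python
import torch
import torch.nn as nn


class ScaleShift(nn.Module):
    """Per-channel linear transformation y = gamma * x + beta (assumed form)."""

    def __init__(self, dim):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return x * self.gamma + self.beta


class ParallelAdapterSketch(nn.Module):
    """Bottleneck adapter with a ScaleShift after each FC layer (hypothetical)."""

    def __init__(self, dim, bottleneck):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.down_lt = ScaleShift(bottleneck)
        self.act = nn.ReLU()
        self.up = nn.Linear(bottleneck, dim)
        self.up_lt = ScaleShift(dim)

    def forward(self, x):
        h = self.act(self.down_lt(self.down(x)))
        return self.up_lt(self.up(h))


def merge_linear(fc, lt):
    """Fold a ScaleShift into the nn.Linear that precedes it."""
    merged = nn.Linear(fc.in_features, fc.out_features)
    with torch.no_grad():
        # fc.weight has shape (out, in); scale each output row by gamma.
        merged.weight.copy_(fc.weight * lt.gamma.unsqueeze(1))
        merged.bias.copy_(fc.bias * lt.gamma + lt.beta)
    return merged
```

After merging, the adapter reduces to two plain FC layers with an activation in between, so the linear transformation layers add no computation at inference time.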

Preparation

Download the RSVQA and RSIVQA datasets.

Training

For the RSVQA-LR dataset:
- Change the default path of the image files.

python train_lr.py

For the RSVQA-HR dataset:
- Change the default path of the image files.

python train_hr.py

For the RSIVQA dataset:
- Change the default path of the image files.
- RSIVQA comprises multiple datasets with varying image sizes, so all images are resized to a unified 256 × 256 before being fed into the model. Resize the images before training on RSIVQA.

python train_rsi.py
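The RSIVQA resizing step can be sketched as follows. This is a hedged example rather than the authors' preprocessing script: the function name resize_images, the directory arguments, the list of file extensions, the RGB conversion, and the bicubic interpolation are all assumptions. It uses Pillow and mirrors the source folder layout in the output folder.

```python
from pathlib import Path

from PIL import Image


def resize_images(src_dir, dst_dir, size=(256, 256)):
    """Resize every image under src_dir to `size` and save it under dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    # Assumed set of image extensions; adjust to the files in your copy.
    extensions = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp"}
    count = 0
    for path in sorted(src_dir.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        target = dst_dir / path.relative_to(src_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        with Image.open(path) as img:
            # Convert to RGB so all outputs share the same channel layout.
            img.convert("RGB").resize(size, Image.BICUBIC).save(target)
        count += 1
    return count
```

Note that this resize does not preserve aspect ratio; it simply maps every image to 256 × 256. After running it, point the default image path at the output folder.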

RSAdapter is implemented in RSAdapter/src/t/src/transformers/models/vilt/modeling_vilt_test.py (line 479 in commit 6a76278):

class RSAdapter(nn.Module):

RSAdapter is added to the ViLT model in the same file, at line 546 in commit 6a76278:

# parallel attn adapter

and at line 561 in commit 6a76278:

layer_output = layer_output + mlp_adapter * self.mlp_adapter_scale

COMPARISON WITH SOTA

TODO

- Add inference code

Acknowledgement

The code is based on the Hugging Face transformers library. The authors would also like to thank the contributors to the RSVQA and RSIVQA datasets.
