HIT-SIRS/SMLFR

Generative ConvNet Foundation Model with Sparse and Low-Frequency Filtered Masked Modeling for Remote Sensing Image Interpretation

Introduction

This is the official repository for the paper "Generative ConvNet Foundation Model with Sparse and Low-Frequency Filtered Masked Modeling for Remote Sensing Image Interpretation".

Abstract: Foundation models offer a highly versatile and precise solution for intelligent interpretation of remote sensing images, thus greatly facilitating various remote sensing applications. Nevertheless, current foundation models for remote sensing predominantly employ vision transformers based on generative methods, with no corresponding exploration of ConvNets with masked image modeling (MIM). In this paper, we make the first attempt to propose a generative ConvNet foundation model tailored for remote sensing scenarios, which comprises two key components. First, a large dataset named GeoSense, containing approximately nine million diverse remote sensing images, is constructed to enhance the robustness and generalization of the foundation model during the pre-training phase. Second, a sparse and low-frequency filtered masked modeling (SLFFM) self-supervised learning framework is designed for representation learning of the ConvNet foundation model. Specifically, we introduce sub-manifold sparse convolutions to enable the ConvNet to process variable-length sequences for MIM self-supervised pre-training. Additionally, a low-frequency filtered reconstruction target is designed to guide the model's attention towards essential ground object features in remote sensing images, while mitigating unnecessary detail interference. To evaluate the general performance of our proposed foundation model, comprehensive experiments have been carried out on five datasets across three downstream tasks (i.e., object detection, semantic segmentation, and change detection). Experimental results demonstrate that our method consistently achieves state-of-the-art performance across all benchmark datasets and downstream tasks.
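The two ideas named in the abstract, patch masking and a low-frequency filtered reconstruction target, can be illustrated with a small NumPy sketch. This is not the code from this repository: the ideal circular low-pass filter, the patch size, the mask ratio, the cutoff value, and the function names `random_patch_mask`, `low_pass`, and `masked_mse` are all assumptions chosen for illustration, and the sub-manifold sparse convolutions are not modelled here.

```python
import numpy as np

def random_patch_mask(h, w, patch=32, mask_ratio=0.6, rng=None):
    """Boolean (h, w) mask; True marks pixels inside masked patches.

    patch and mask_ratio are illustrative values, not the paper's settings.
    """
    rng = np.random.default_rng() if rng is None else rng
    gh, gw = h // patch, w // patch
    n_masked = int(round(gh * gw * mask_ratio))
    flat = np.zeros(gh * gw, dtype=bool)
    flat[rng.choice(gh * gw, size=n_masked, replace=False)] = True
    grid = flat.reshape(gh, gw)
    # Expand the patch grid to pixel resolution.
    return np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)

def low_pass(img, cutoff=0.25):
    """Ideal low-pass filter for a 2-D array.

    cutoff is a fraction (0 < cutoff < 1) of the Nyquist frequency;
    every frequency with a larger radius is set to zero.
    """
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]  # cycles per pixel in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)[None, :]
    keep = np.sqrt(fx ** 2 + fy ** 2) <= cutoff * 0.5
    return np.fft.ifft2(np.fft.fft2(img) * keep).real

def masked_mse(pred, target, mask):
    """Mean squared error computed only on masked pixels."""
    return float(((pred - target) ** 2)[mask].mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((224, 224))
    mask = random_patch_mask(224, 224, rng=rng)
    target = low_pass(image)          # smoothed reconstruction target
    prediction = np.zeros_like(image) # stand-in for a decoder output
    print(masked_mse(prediction, target, mask))
```

Filtering the target rather than the input means the encoder still sees the full-detail visible patches, while the loss only rewards recovering coarse structure in the masked regions.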

flowchart

Pre-trained and Fine-tuned Models

Pre-training

GeoSense

Pre-train  Backbone        Input Size  Parameters  Pretrained Model
SLFFM      ConvNeXt-Base   224x224     89M         Weights
SLFFM      ConvNeXt-Large  224x224     198M        Weights

Object Detection

DOTA-v1.0

Method          Pre-train  Backbone        Lr Schd  mAP    Config  Model
Oriented R-CNN  SLFFM      ConvNeXt-Base   1x       79.15  Config  Weights
Oriented R-CNN  SLFFM      ConvNeXt-Large  1x       79.33  Config  Weights

DIOR-R

Method          Pre-train  Backbone        Lr Schd  mAP    Config  Model
Oriented R-CNN  SLFFM      ConvNeXt-Base   1x       71.50  Config  Weights
Oriented R-CNN  SLFFM      ConvNeXt-Large  1x       72.33  Config  Weights

Semantic Segmentation

Potsdam

Method   Pre-train  Backbone        Lr Schd  OA     Config  Model
UperNet  SLFFM      ConvNeXt-Base   160k     91.72  Config  Weights
UperNet  SLFFM      ConvNeXt-Large  160k     91.82  Config  Weights

LoveDA

Method   Pre-train  Backbone        Lr Schd  mIoU   Config  Model
UperNet  SLFFM      ConvNeXt-Base   160k     52.59  Config  Weights
UperNet  SLFFM      ConvNeXt-Large  160k     53.03  Config  Weights

Change Detection

LEVIR-CD

Method  Pre-train  Backbone        Lr Schd  F1     Config  Model
BIT     SLFFM      ConvNeXt-Base   20k      93.66  Config  Weights
BIT     SLFFM      ConvNeXt-Large  20k      93.89  Config  Weights

Usage

Environment

  • python 3.8.13
  • pytorch 1.12.1+cu113
  • torchvision 0.13.1+cu113
  • timm 0.6.12
  • mmdet 2.28.2
  • mmsegmentation 0.30.0
  • opencd 0.0.3
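Before training, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch using only the standard library; it assumes each package is installed under the distribution name shown in the dictionary (for example `torch` for pytorch), which may differ in your setup, and the helper name `check_versions` is hypothetical.

```python
from importlib import metadata

# Versions from the Environment list; the distribution names are assumptions.
EXPECTED = {
    "torch": "1.12.1+cu113",
    "torchvision": "0.13.1+cu113",
    "timm": "0.6.12",
    "mmdet": "2.28.2",
    "mmsegmentation": "0.30.0",
    "opencd": "0.0.3",
}

def check_versions(expected=EXPECTED):
    """Return {name: (expected, found)} for every mismatch.

    found is None when the distribution is not installed.
    """
    problems = {}
    for name, want in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != want:
            problems[name] = (want, found)
    return problems

if __name__ == "__main__":
    for name, (want, found) in check_versions().items():
        print(f"{name}: expected {want}, found {found or 'not installed'}")
```

An empty result means every listed package was found at the expected version.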

Pre-training

torchrun --nproc_per_node=8 --nnodes=1 --node_rank=0 --master_addr=localhost --master_port=1234 main.py --data_path=${DataPath} --exp_name=${ExpName} --exp_dir=${ExpDir} --model=${Model} --bs=1024 --init_weight=${InitWeight}
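When launching several pre-training runs from a script, the command above can be assembled as an argument list instead of a hand-edited string. This is a sketch only: the helper `pretrain_command` is hypothetical, the paths in the usage example are placeholders, and it assumes `data_path` takes the same `--` prefix as the other options of `main.py`.

```python
import shlex

def pretrain_command(data_path, exp_name, exp_dir, model, init_weight,
                     bs=1024, gpus=8, master_port=1234):
    """Build the single-node torchrun invocation as a list of arguments."""
    return [
        "torchrun",
        f"--nproc_per_node={gpus}",
        "--nnodes=1",
        "--node_rank=0",
        "--master_addr=localhost",
        f"--master_port={master_port}",
        "main.py",
        f"--data_path={data_path}",  # assumed to be a normal "--" option
        f"--exp_name={exp_name}",
        f"--exp_dir={exp_dir}",
        f"--model={model}",
        f"--bs={bs}",
        f"--init_weight={init_weight}",
    ]

if __name__ == "__main__":
    # Placeholder values; substitute your own paths and model name.
    cmd = pretrain_command("/path/to/data", "my_exp", "/path/to/exp",
                           "MODEL_NAME", "/path/to/init.pth")
    print(shlex.join(cmd))
    # To launch: subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when paths contain spaces.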

Finetune on Object Detection

Train:

bash tools/dist_train.sh ${ConfigPath} 8

Test:

bash tools/dist_test.sh ${ConfigPath} ${CheckpointPath} 8 --format-only --eval-options submission_dir=${SubmissionDir}

Finetune on Semantic Segmentation

Train:

bash tools/dist_train.sh ${ConfigPath} 8

Test:

bash tools/dist_test.sh ${ConfigPath} ${CheckpointPath} 8 --eval 'mFscore' 'mIoU'
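For readers unfamiliar with the metric names passed to `--eval`, and with the OA reported for Potsdam, the sketch below shows how overall accuracy, mIoU, and the mean F-score (with beta = 1) follow from a pixel-level confusion matrix. It illustrates the standard definitions only and is not the evaluation code of mmsegmentation; handling of ignored labels is left out, and the function names are hypothetical.

```python
def per_class_metrics(conf):
    """Per-class IoU and F1 from a square confusion matrix.

    conf[i][j] = number of pixels with ground-truth class i predicted as j.
    """
    n = len(conf)
    ious, f1s = [], []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                       # missed pixels of class c
        fp = sum(conf[r][c] for r in range(n)) - tp  # pixels wrongly given class c
        union = tp + fp + fn
        ious.append(tp / union if union else float("nan"))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else float("nan"))
    return ious, f1s

def nan_mean(values):
    """Mean that skips NaN entries (classes absent from both labels and predictions)."""
    valid = [v for v in values if v == v]
    return sum(valid) / len(valid)

def overall_accuracy(conf):
    """Fraction of all pixels that were classified correctly."""
    total = sum(sum(row) for row in conf)
    return sum(conf[i][i] for i in range(len(conf))) / total

if __name__ == "__main__":
    conf = [[8, 2],
            [1, 9]]
    ious, f1s = per_class_metrics(conf)
    print("OA:", overall_accuracy(conf))
    print("mIoU:", nan_mean(ious))
    print("mFscore:", nan_mean(f1s))
```

Because IoU penalizes false positives and false negatives in the denominator without doubling the true positives, a class's IoU is never higher than its F1.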

Finetune on Change Detection

Train:

bash tools/dist_train.sh ${ConfigPath} 8

Test:

bash tools/dist_test.sh ${ConfigPath} ${CheckpointPath} 8 --eval mFscore mIoU
