
Scaling Efficient Masked Autoencoder Learning on Large Remote Sensing Dataset

Fengxiang Wang1    Hongzhen Wang2,‡    Di Wang3    Zonghao Guo4
Zhenyu Zhong5    Long Lan1,‡   Jing Zhang6  Zhiyuan Liu2    Maosong Sun2   

1 National University of Defense Technology    2 Tsinghua University    3 Wuhan University
4 University of Chinese Academy of Sciences    5 Nankai University    6 The University of Sydney

Introduction

  • RS-4M: A large-scale remote sensing dataset comprising 4 million optical images with diverse scene details, designed to fully exploit the representation learning capabilities of masked image modeling (MIM) methods in remote sensing applications.
  • SelectiveMAE: A novel and efficient MIM method tailored for remote sensing images. It incorporates a new PSTS module, which significantly accelerates convergence and enhances representation learning compared to vanilla MIM approaches.
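To make the MIM setup concrete, the sketch below shows the uniform random patch masking used by vanilla MAE, which is the baseline SelectiveMAE's PSTS module improves upon (the PSTS selection strategy itself is not reproduced here; the function name and the 196-patch/75% example are illustrative, not taken from this codebase).

```python
import numpy as np

def random_masking(num_patches: int, mask_ratio: float, seed: int = 0):
    """Split patch indices into visible and masked sets, as in vanilla MAE.

    SelectiveMAE's PSTS module replaces this uniform sampling with a
    progressive token selection; this sketch only shows the random-masking
    baseline it builds on.
    """
    rng = np.random.default_rng(seed)
    num_visible = int(num_patches * (1 - mask_ratio))
    perm = rng.permutation(num_patches)
    visible_idx = np.sort(perm[:num_visible])  # patches the encoder sees
    masked_idx = np.sort(perm[num_visible:])   # patches the decoder reconstructs
    return visible_idx, masked_idx

# A 224x224 image with 16x16 patches yields 196 patches; MAE masks 75% of them.
vis, msk = random_masking(196, 0.75)
print(len(vis), len(msk))  # 49 147
```

Because the encoder only processes the visible 25% of tokens, pretraining cost drops sharply; PSTS pushes this further by choosing which tokens to keep rather than sampling uniformly.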

Todo List

  • Initial release of SelectiveMAE checkpoints. 🚀
  • Code and configs for the scene classification downstream task of SelectiveMAE. 🚀
  • Code and configs for the object detection and semantic segmentation downstream tasks of SelectiveMAE.
  • Pretraining code and configs for SelectiveMAE.
  • Release of the RS-4M dataset.

Updates

  • [2024.06] - The training logs of SelectiveMAE have been released.


RS-4M

The RS-4M dataset contains about 4 million high-quality optical remote sensing images, four times more than previous representative remote sensing datasets.

Examples of RS-4M

Experiments on RS-4M

RS-4M offers a significantly larger and more diverse image set than previous datasets. To evaluate its effectiveness, we pre-train a ViT-Base model using the vanilla MAE method. For comparison, we use the MillionAID dataset, keeping the total number of training samples equal: 800 epochs on MillionAID's 1 million images versus 200 epochs on our 4-million-image RS-4M.
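The epoch counts in the table below follow directly from matching total seen samples against the MillionAID reference schedule. A minimal sketch of that arithmetic (the helper function is illustrative, not part of this codebase):

```python
def equivalent_epochs(ref_images: int, ref_epochs: int, dataset_images: int) -> int:
    """Epochs needed so the total number of seen samples matches the reference.

    MillionAID reference: 1M images x 800 epochs = 800M samples seen.
    """
    total_samples = ref_images * ref_epochs
    return round(total_samples / dataset_images)

print(equivalent_epochs(1_000_000, 800, 4_000_000))  # 200
print(equivalent_epochs(1_000_000, 800, 3_000_000))  # 267
print(equivalent_epochs(1_000_000, 800, 2_000_000))  # 400
```

This is why the 2M-, 3M-, and 4M-image subsets are trained for 400, 267, and 200 epochs respectively: each schedule exposes the model to the same 800M samples as MillionAID at 800 epochs.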

AID and RESISC-45 are scene classification benchmarks (OA at TR=20%/50%); DIOR and DIOR-R are object detection benchmarks (mAP50); LoveDA and SpaceNetv1 are semantic segmentation benchmarks (mIoU and mF1).

| Dataset | Pretrained model | # Images | Epochs | AID OA (TR=20%/50%) | RESISC-45 OA (TR=20%/50%) | DIOR mAP50 | DIOR-R mAP50 | LoveDA mIoU | SpaceNetv1 mF1 |
|---|---|---|---|---|---|---|---|---|---|
| MillionAID | Weights | 1 million | 800 | 94.92/97.38 | 89.20/93.60 | 71.80 | 62.33 | 51.24 | 79.24 |
| RS-4M | Weights | 2 million | 400 | 96.64/98.10 | 91.80/94.31 | 73.90 | 65.95 | 52.86 | 79.37 |
| RS-4M | Weights | 3 million | 267 | 96.67/98.18 | 92.24/94.41 | 75.40 | 67.07 | 52.39 | 79.37 |
| RS-4M | Weights | 4 million | 200 | 96.10/98.03 | 92.38/94.30 | 74.70 | 66.26 | 52.75 | 79.23 |
| RS-4M | Weights | 4 million | 800 | 96.88/98.22 | 92.44/94.43 | 75.40 | 67.35 | 52.80 | 79.41 |

SelectiveMAE

⚙️ Installation

For installation instructions, please refer to INSTALL.md.

🚙 Pretraining

For usage of the pretraining code, please refer to PRETRAIN.md.

🚀 Results on downstream tasks

AID and RESISC-45 are scene classification benchmarks (OA at TR=20%/50%); DIOR and DIOR-R are object detection benchmarks (mAP50); LoveDA and SpaceNetv1 are semantic segmentation benchmarks (mIoU and mF1).

| Model | Publication | Backbone | AID OA (TR=20%/50%) | RESISC-45 OA (TR=20%/50%) | DIOR mAP50 | DIOR-R mAP50 | LoveDA mIoU | SpaceNetv1 mF1 |
|---|---|---|---|---|---|---|---|---|
| SeCo | ICCV'21 | ResNet-50 | 93.47/95.99 | 89.64/92.91 | - | - | 43.63 | 77.09 |
| GASSL | ICCV'21 | ResNet-50 | 93.55/95.92 | 90.86/93.06 | 67.40 | 65.65 | 48.76 | 78.51 |
| TOV | JSTARS'23 | ResNet-50 | 95.16/97.09 | 90.97/93.79 | 70.16 | 66.33 | 49.70 | - |
| CACo | CVPR'23 | ResNet-50 | 90.88/95.05 | 88.28/91.94 | 66.91 | 64.10 | 48.89 | 77.94 |
| SatMAE | NIPS'22 | ViT-L | 95.02/96.94 | 91.72/94.10 | 70.89 | 65.66 | - | 78.07 |
| ScaleMAE | ICCV'23 | ViT-L | 96.44/97.58 | 92.63/95.04 | 73.81 | 66.47 | - | - |
| SSL4EO | GRSM'23 | ViT-S | 91.06/94.74 | 87.60/91.27 | 64.82 | 61.23 | - | - |
| RingMo | TGRS'22 | Swin-B | 96.90/98.34 | 94.25/95.67 | 75.90 | - | - | - |
| SatLas | ICCV'23 | Swin-B | 94.96/97.38 | 92.16/94.70 | 74.10 | 67.59 | - | - |
| GFM | ICCV'23 | Swin-B | 95.47/97.09 | 92.73/94.64 | 72.84 | 67.67 | - | - |
| RVSA | TGRS'23 | ViT-B+RVSA | 97.03/98.50 | 93.93/95.69 | 75.80 | 68.06 | 51.95 | - |
| SelectiveMAE | - | ViT-B | 96.78/98.12 | 93.35/94.58 | 75.70 | 67.78 | 53.05 | 79.50 |
| SelectiveMAE | - | ViT-L | 97.25/98.48 | 94.57/95.77 | 77.80 | 70.31 | 54.31 | 79.46 |

License

This work is released under the Apache License, Version 2.0, while some specific operations in this codebase may be under other licenses. Please check LICENSE.md carefully if you are using our code for commercial purposes.

Acknowledgements
