Awesome-Remote-Sensing-Multimodal-Large-Language-Model (Vision-Language)

📢 A collection of remote sensing multimodal large language model papers focusing on the vision-language domain.

Author: Yang Zhan

School of Artificial Intelligence, OPtics, and ElectroNics (iOPEN), Northwestern Polytechnical University

Please give this project a STAR ⭐ if it helps you.

📢 Latest Updates

This repository collects and documents researchers and their outstanding work on remote sensing multimodal large language models (vision-language).

  • The list will be continuously updated 🔥🔥
  • 📦 coming soon! 🚀

Content

Papers

  • 🔥 Apr-23-24: RS-LLaVA: A Large Vision-Language Model for Joint Captioning and Question Answering in Remote Sensing Imagery

Remote Sensing 2024 (doi: 10.3390/rs16091477). Y. Bazi, L. Bashmal, M. M. Al Rahhal, R. Ricci, and F. Melgani. [Paper][Code]

  • 🔥 Mar-29-24: H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model

arXiv 2024 (arXiv:2403.20213). C. Pang, W. Jiang, L. Jiayu, L. Yi, S. Jiaxing, L. Weijia, W. Xingxing, W. Shuai, F. Litong, X. Guisong, and H. Conghui. [Paper][Code]

  • 🔥 Mar-6-24: Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery

arXiv 2024 (arXiv:2403.03790). W. Zhang, M. Cai, T. Zhang, G. Lei, Y. Zhuang, and X. Mao. [Paper][[Code]:Null]

  • 🔥 Feb-9-24: Large Language Models for Captioning and Retrieving Remote Sensing Images

arXiv 2024 (arXiv:2402.06475). J. D. Silva, J. Magalhaes, and D. Tuia. [Paper][[Code]:Null]

  • 🔥 Feb-4-24: LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model

arXiv 2024 (arXiv:2402.02544). D. Muhtar, Z. Li, F. Gu, X. Zhang, and P. Xiao. [Paper][Code]

  • 🔥 Jan-30-24: EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain

arXiv 2024 (arXiv:2401.16822). W. Zhang, M. Cai, T. Zhang, Y. Zhuang, and X. Mao. [Paper][Code]

  • 🔥 Jan-18-24: SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model

arXiv 2024 (arXiv:2401.09712). Y. Zhan, Z. Xiong, and Y. Yuan. [Paper][Code]

  • 🔥 Nov-30-23: Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs

arXiv 2023 (arXiv:2311.14656). J. Roberts, T. Lüddecke, R. Sheikh, K. Han, and S. Albanie. [Paper][Code]

  • 🔥 Nov-28-23: GeoChat: Grounded Large Vision-Language Model for Remote Sensing

arXiv 2023 (arXiv:2311.15826). K. Kuckreja, M. S. Danish, M. Naseer, A. Das, S. Khan, and F. S. Khan. [Paper][Code]

  • 🔥 Jul-28-23: RSGPT: A Remote Sensing Vision Language Model and Benchmark

arXiv 2023 (arXiv:2307.15266). Y. Hu, J. Yuan, and C. Wen. [Paper][Code]

Remote Sensing Vision-Language Dataset

  • 🔥 Feb-17-24: ChatEarthNet: A Global-Scale, High-Quality Image-Text Dataset for Remote Sensing

arXiv 2024 (arXiv:2402.11325). Z. Yuan, Z. Xiong, L. Mou, and X. X. Zhu. [Paper][[Code]:Null]

  • 🔥 Jan-2-24: RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing

arXiv 2023 (arXiv:2306.11300). Z. Zhang, T. Zhao, Y. Guo, and J. Yin. [Paper][Code]

  • 🔥 Dec-20-23: SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing

AAAI 2024 (arXiv:2312.12856). Z. Wang, R. Prabha, T. Huang, J. Wu, and R. Rajagopal. [Paper][Code]

Related: Remote Sensing Vision-Language Foundation Models

  • 🔥 Jan-2-24: RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing

arXiv 2023 (arXiv:2306.11300). Z. Zhang, T. Zhao, Y. Guo, and J. Yin. [Paper][Code]

  • 🔥 Dec-12-23: Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment

arXiv 2023 (arXiv:2312.06960). U. Mall, C. P. Phoo, M. K. Liu, C. Vondrick, B. Hariharan, and K. Bala. [Paper][[Code]:Null]

  • 🔥 Aug-10-23: RemoteCLIP: A Vision Language Foundation Model for Remote Sensing

arXiv 2023 (arXiv:2306.11029). F. Liu, D. Chen, Z. Guan, X. Zhou, J. Zhu, and J. Zhou. [Paper][Code]