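This project only provides Chinese-speaking CVers with a reference example for quickly deploying the VMamba visual representation model.
This project is submitted as a code diff; be sure to refer to:
This project is based on the 2024/05/30 versions of the VMamba and VHeat code.
Project organization:
- Use the "VMamba/classification/models/" folder as the "mmrotate/models/backbones/vmamba_models/" folder, and the "vHeat/detection/vHeat/" folder as the "mmrotate/models/backbones/vheat_models/" folder
- Use the "VMamba/detection/model.py" file as the "mmrotate/models/backbones/vmamba_model.py" file and the "vHeat/detection/model.py" file as the "mmrotate/models/backbones/vheat_model.py" file, and modify "__init__.py" accordingly
- Create config files for VMamba and VHeat
- Migrate the "VMamba/kernels/" folder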
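The file placement in the list above can be sketched as a small copy script. This is only an illustration: `place_sources`, `upstream_root` and `mmrotate_root` are hypothetical names, the upstream checkouts are assumed to be folders named `VMamba` and `vHeat`, and the destination of the `kernels` folder (the mmrotate root) is inferred from the later `cd kernels/selective_scan/` command. Editing `__init__.py` and writing the config files still has to be done by hand.

```python
import shutil
from pathlib import Path

# (source relative to the upstream checkouts, destination relative to the mmrotate root)
# The first four pairs come from the list above; the "kernels" destination is an assumption.
LAYOUT = [
    ("VMamba/classification/models", "mmrotate/models/backbones/vmamba_models"),
    ("vHeat/detection/vHeat", "mmrotate/models/backbones/vheat_models"),
    ("VMamba/detection/model.py", "mmrotate/models/backbones/vmamba_model.py"),
    ("vHeat/detection/model.py", "mmrotate/models/backbones/vheat_model.py"),
    ("VMamba/kernels", "kernels"),
]


def place_sources(upstream_root, mmrotate_root, layout=LAYOUT):
    """Copy each upstream file or folder to its destination.

    Sources that do not exist are skipped. Returns the list of destinations written.
    """
    copied = []
    for src_rel, dst_rel in layout:
        src = Path(upstream_root) / src_rel
        dst = Path(mmrotate_root) / dst_rel
        if not src.exists():
            continue  # nothing checked out at this path
        dst.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            # dirs_exist_ok needs Python 3.8+, which matches the environment used here
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Calling `place_sources("..", ".")` from the mmrotate root would copy from sibling checkouts, if they are laid out that way.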
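[Note] We recommend mmrotate-0.3.3/0.3.4, which is a relatively stable version. mmrotate-dev-1.x is the future mainstream version built on mmcv-2 and mmdet-3, but its multi-scale testing may have some issues.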
# Required by https://github.com/state-spaces/mamba: PyTorch 1.12+ and CUDA 11.6+
# wget https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda_11.7.1_515.65.01_linux.run
# chmod +x ./cuda_11.7.1_515.65.01_linux.run
# sudo sh cuda_11.7.1_515.65.01_linux.run
wget https://developer.download.nvidia.com/compute/cuda/11.6.2/local_installers/cuda_11.6.2_510.47.03_linux.run
chmod +x ./cuda_11.6.2_510.47.03_linux.run
sudo ./cuda_11.6.2_510.47.03_linux.run
#
# vi ~/.bashrc
# Add CUDA path
# export PATH=/usr/local/cuda-11.7/bin:$PATH
# export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-11.6/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.6/lib64:$LD_LIBRARY_PATH
export NCCL_P2P_DISABLE="1"
export NCCL_IB_DISABLE="1"
#
source ~/.bashrc
nvcc -V
#
# Do NOT use sudo when installing Anaconda
# chmod +x ./Anaconda3-2023.09-0-Linux-x86_64.sh
# ./Anaconda3-2023.09-0-Linux-x86_64.sh
#
# conda create -n openmmlab1131 python=3.9 -y
# conda activate openmmlab1131
# # ref: https://pytorch.org/get-started/previous-versions/#v1131
# conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
conda create -n openmmlab1121 python=3.8 -y
conda activate openmmlab1121
# ref: https://pytorch.org/get-started/previous-versions/#v1121
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
#
pip install shapely tqdm timm
#
# # if mmrotate-dev-1.x
# pip install -U openmim
# mim install mmengine
# # As required by dev-1.x, install versions newer than mmcv==2.0.0rc2 and mmdet==3.0.0rc6
# mim install mmcv==2.0.1
# mim install mmdet==3.1.0
# # The mmrotate-dev-1.x commit we used is fd60beff130a54e284a73651903de29fe728f97b; please verify it
# git clone https://github.com/open-mmlab/mmrotate.git -b dev-1.x
# cd mmrotate
# pip install -r requirements/build.txt
# pip install -v -e .
#
# if mmrotate-0.3.3/0.3.4
pip install openmim
mim install mmcv-full==1.6.1
mim install mmdet==2.25.1
git clone https://github.com/open-mmlab/mmrotate.git
cd mmrotate
pip install -r requirements/build.txt
pip install -v -e .
#
# Downgrade some packages for mmrotate-0.3.3/0.3.4
pip install numpy==1.21.5
pip install yapf==0.40.1
#
# Install the required VMamba dependencies
pip install einops fvcore triton ninja
cd kernels/selective_scan/ && pip install . && cd ../../
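After the installation, it can help to confirm that the pinned versions were actually picked up. The sketch below is a minimal check for the mmrotate-0.3.3/0.3.4 route, using only the standard library. `check_versions` and `EXPECTED` are hypothetical names, and the pins are copied from the commands above.

```python
from importlib import metadata

# Version pins copied from the install commands above (mmrotate-0.3.3/0.3.4 route).
EXPECTED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "mmcv-full": "1.6.1",
    "mmdet": "2.25.1",
    "numpy": "1.21.5",
    "yapf": "0.40.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, expected_version, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        # Some builds append a local tag such as "+cu116"; compare the public part only.
        ok = have is not None and have.split("+")[0] == want
        report[name] = (have, want, ok)
    return report


if __name__ == "__main__":
    for name, (have, want, ok) in check_versions().items():
        status = "OK " if ok else "BAD"
        print(f"{status} {name}: installed={have} expected={want}")
```

This only compares version strings. It does not verify that the selective_scan kernel compiled correctly or that CUDA is visible to PyTorch.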
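Create the dataset by extracting the archives downloaded from the official website
# First prepare the mmrotate-data and mmrotate-tools folders alongside the mmrotate root directory
# All commands below are run from the mmrotate root directory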
#
# ln -s /Workspace/Dataset/DOTA/ ../mmrotate-data/data/
#
ln -s ../mmrotate-data/data/ ./
ln -s ../mmrotate-data/work_dirs/ ./
#
python ../mmrotate-tools/md5_calc.py --path ./data/DOTA/train.tar.gz
# cfb5007ada913241e02c24484e12d5d2
python ../mmrotate-tools/md5_calc.py --path ./data/DOTA/val.tar.gz
# a53e74b0d69dacf3ffcb438accd60c45
python ../mmrotate-tools/md5_calc.py --path ./data/DOTA/test/part1.zip
# d3028e48da64b37ad2f2f5f31059e0da
python ../mmrotate-tools/md5_calc.py --path ./data/DOTA/test/part2.zip
# 99f779850cc44b8f8b28d348494c6b41
#
tar -xzf ./data/DOTA/train.tar.gz -C ./data/DOTA/
tar -xzf ./data/DOTA/val.tar.gz -C ./data/DOTA/
unzip ./data/DOTA/test/part1.zip -d ./data/DOTA/test/
unzip ./data/DOTA/test/part2.zip -d ./data/DOTA/test/
#
python ../mmrotate-tools/dir_list.py --path ./data/DOTA/train/images/ --output ./data/DOTA/train/trainset.txt
# 1411
python ../mmrotate-tools/dir_list.py --path ./data/DOTA/val/images/ --output ./data/DOTA/val/valset.txt
# 458
python ../mmrotate-tools/dir_list.py --path ./data/DOTA/test/images/ --output ./data/DOTA/test/testset.txt
# 937
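The source of `md5_calc.py` is not shown here, so the following is only a sketch of an equivalent check, assuming the tool computes a plain MD5 over the archive bytes. `md5_of` and `verify` are hypothetical names, and the checksums are copied from the comments above.

```python
import hashlib
from pathlib import Path

# Checksums copied from the comments above, keyed by path relative to ./data/DOTA/.
EXPECTED_MD5 = {
    "train.tar.gz": "cfb5007ada913241e02c24484e12d5d2",
    "val.tar.gz": "a53e74b0d69dacf3ffcb438accd60c45",
    "test/part1.zip": "d3028e48da64b37ad2f2f5f31059e0da",
    "test/part2.zip": "99f779850cc44b8f8b28d348494c6b41",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(root, expected=EXPECTED_MD5):
    """Return the relative paths that are missing or whose MD5 differs from the expected value."""
    bad = []
    for rel, want in expected.items():
        path = Path(root) / rel
        if not path.is_file() or md5_of(path) != want:
            bad.append(rel)
    return bad


if __name__ == "__main__":
    problems = verify("./data/DOTA")
    print("all archives match" if not problems else f"mismatch or missing: {problems}")
```

Running the check before extraction avoids splitting a corrupted or partially downloaded archive.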
Image splitting with mmrotate
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_train.json
# Total images number: 15749
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_val.json
# Total images number: 5297
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_trainval.json
# Total images number: 21046
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_test.json
# Total images number: 10833
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ms_trainval.json
# Total images number: 138883
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ms_test.json
# Total images number: 71888
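The patch counts in the comments above can be used as a sanity check after splitting. The sketch below counts image files in each output folder and reports mismatches. `count_images` and `compare` are hypothetical helpers, and the output folders must be taken from the `save_dir` entries of your own split configs, since the paths are not stated here. Note that the single-scale numbers are self-consistent: 15749 + 5297 = 21046.

```python
from pathlib import Path

# Patch counts copied from the comments above.
EXPECTED_PATCHES = {
    "ss_train": 15749,
    "ss_val": 5297,
    "ss_trainval": 21046,
    "ss_test": 10833,
    "ms_trainval": 138883,
    "ms_test": 71888,
}

# The trainval split is the union of the train and val patches.
assert (EXPECTED_PATCHES["ss_train"] + EXPECTED_PATCHES["ss_val"]
        == EXPECTED_PATCHES["ss_trainval"])


def count_images(image_dir, exts=(".png", ".jpg")):
    """Count files directly inside image_dir whose extension is in exts."""
    return sum(1 for p in Path(image_dir).iterdir()
               if p.is_file() and p.suffix.lower() in exts)


def compare(split_dirs, expected=EXPECTED_PATCHES):
    """split_dirs maps a split name to its images folder.

    Returns {name: (found, expected)} for every split whose count differs.
    """
    mismatches = {}
    for name, image_dir in split_dirs.items():
        found = count_images(image_dir)
        if found != expected[name]:
            mismatches[name] = (found, expected[name])
    return mismatches
```

An empty result from `compare` means every listed split produced the expected number of patches.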
The pretrained model directory is ./data/pretrained/
sudo apt install swig
swig -c++ -python polyiou.i
python setup.py build_ext --inplace
Multi-GPU training:
# if mmrotate-0.3.3/0.3.4
CUDA_VISIBLE_DEVICES=0,1 ./tools/dist_train.sh ./configs/_rotated_faster_rcnn_/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py 2
CUDA_VISIBLE_DEVICES=0,1 nohup ./tools/dist_train.sh ./configs/_rotated_faster_rcnn_/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py 2 > nohup.log 2>&1 &
#
# if mmrotate-dev-1.x
CUDA_VISIBLE_DEVICES=0,1 ./tools/dist_train.sh ./configs/_rotated_faster_rcnn_/rotated-faster-rcnn-le90_r50_fpn_1x_dota.py 2
CUDA_VISIBLE_DEVICES=0,1 nohup ./tools/dist_train.sh ./configs/_rotated_faster_rcnn_/rotated-faster-rcnn-le90_r50_fpn_1x_dota.py 2 > nohup.log 2>&1 &
Multi-GPU testing with merged results:
# if mmrotate-0.3.3/0.3.4
CUDA_VISIBLE_DEVICES=0,1 ./tools/dist_test.sh ./configs/_rotated_faster_rcnn_/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py ./work_dirs/rotated_faster_rcnn_r50_fpn_1x_dota_le90/rotated_faster_rcnn_r50_fpn_1x_dota_le90-0393aa5c.pth 2 --format-only --eval-options submission_dir="./work_dirs/Task1_r50_033"
python "../DOTA_devkit-master/dota_evaluation_task1.py" --mergedir "./work_dirs/Task1_r50_033/" --imagesetdir "./data/DOTA/val/" --use_07_metric True
# map: 0.820117064577964
#
# if mmrotate-dev-1.x
CUDA_VISIBLE_DEVICES=0,1 ./tools/dist_test.sh ./configs/_rotated_faster_rcnn_/rotated-faster-rcnn-le90_r50_fpn_1x_dota.py ./work_dirs/rotated_faster_rcnn_r50_fpn_1x_dota_le90/rotated_faster_rcnn_r50_fpn_1x_dota_le90-0393aa5c.pth 2
python "../DOTA_devkit-master/dota_evaluation_task1.py" --mergedir "./work_dirs/Task1_rotated-faster-rcnn-le90_r50_fpn_1x_dota/" --imagesetdir "./data/DOTA/val/" --use_07_metric True
# map: 0.8193743727960783
Params & FLOPs calculation:
python ./tools/analysis_tools/get_flops.py ./configs/rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py
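All training and testing were performed on 4×A100 GPUs.
- In the table, split mAP is evaluated on ss-val; merge mAP is evaluated on ss-val or ms-test.
- FLOPs for VMamba in the table have not been computed yet.
- VHeat performance in the table is not released for now.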
Detector | backbone_size | batch_size | init_lr | split mAP | merge mAP | Training Cost | Testing FPS | Params | FLOPs | Configs
Rotated RetinaNet (1x ss) | Swin-T | 4*4 | 1e-4 | 67.14 | 68.68 | 0.7h | 106.6 | 37.13M | 222.08G | cfg
Rotated RetinaNet (1x ss) | Swin-S | 4*4 | 1e-4 | 67.54 | 69.66 | 1.9h | 61.1 | 58.45M | 314.82G | cfg
Rotated RetinaNet (1x ss) | Swin-B | 4*4 | 1e-4 | 68.48 | 70.56 | 2.7h | 45.3 | 97.06M | 461.50G | cfg
Rotated RetinaNet (1x ss) | VHeat-T | 4*4 | 1e-4 | - | - | - | - | - | - | cfg
Rotated RetinaNet (1x ss) | VHeat-S | 4*4 | 1e-4 | - | - | - | - | - | - | cfg
Rotated RetinaNet (1x ss) | VHeat-B | 4*4 | 1e-4 | - | - | - | - | - | - | cfg
Rotated RetinaNet (1x ss) | VMamba-T | 4*4 | 1e-4 | 69.15 | 71.11 | 1.0h | 91.9 | ? | ? | cfg
Rotated RetinaNet (1x ss) | VMamba-S | 4*4 | 1e-4 | 69.78 | 72.17 | 1.8h | 69.7 | ? | ? | cfg
Rotated RetinaNet (1x ss) | VMamba-B | 4*4 | 1e-4 | 69.70 | 71.77 | 2.2h | 58.8 | ? | ? | cfg
Rotated Faster RCNN (1x ss) | Swin-T | 4*4 | 1e-4 | 70.11 | 72.62 | 0.7h | 106.1 | 44.76M | 215.54G | cfg
Rotated Faster RCNN (1x ss) | Swin-S | 4*4 | 1e-4 | 70.39 | 73.22 | 1.9h | 58.7 | 66.08M | 308.28G | cfg
Rotated Faster RCNN (1x ss) | Swin-B | 4*4 | 1e-4 | 71.73 | 73.91 | 2.7h | 44.1 | 104.11M | 455.35G | cfg
Rotated Faster RCNN (1x ss) | VHeat-T | 4*4 | 1e-4 | - | 73.16 | - | - | 49.71M | 219.11G | cfg
Rotated Faster RCNN (1x ss) | VHeat-S | 4*4 | 1e-4 | 72.86 | 73.65 | - | - | 71.08M | 306.28G | cfg
Rotated Faster RCNN (1x ss) | VHeat-B | 4*4 | 1e-4 | - | 73.56 | - | - | 111.65M | 451.47G | cfg
Rotated Faster RCNN (1x ss) | VMamba-T | 4*4 | 1e-4 | 73.13 | 74.04 | 1.1h | 84.0 | ? | ? | cfg
Rotated Faster RCNN (1x ss) | VMamba-S | 4*4 | 1e-4 | 73.14 | 74.16 | 1.9h | 63.3 | ? | ? | cfg
Rotated Faster RCNN (1x ss) | VMamba-B | 4*4 | 1e-4 | 73.30 | 73.50 | 2.3h | 56.1 | ? | ? | cfg
Oriented RCNN (1x ss) | Swin-T | 4*4 | 1e-4 | 73.88 | 75.92 | 0.8h | 105.0 | 44.76M | 215.68G | cfg
Oriented RCNN (1x ss) | Swin-S | 4*4 | 1e-4 | 74.49 | 76.07 | 2.0h | 58.7 | 66.08M | 308.42G | cfg
Oriented RCNN (1x ss) | Swin-B | 4*4 | 1e-4 | 74.88 | 76.16 | 2.8h | 44.1 | 104.11M | 455.49G | cfg
Oriented RCNN (1x ss) | VHeat-T | 4*4 | 1e-4 | 74.85 | 76.56 | - | - | 49.71M | 219.11G | cfg
Oriented RCNN (1x ss) | VHeat-S | 4*4 | 1e-4 | 74.96 | 76.20 | - | - | 71.08M | 306.42G | cfg
Oriented RCNN (1x ss) | VHeat-B | 4*4 | 1e-4 | 74.58 | 76.54 | - | - | 111.65M | 451.60G | cfg
Oriented RCNN (1x ss) | VMamba-T | 4*4 | 1e-4 | 75.95 | 76.59 | 1.1h | 79.9 | ? | ? | cfg
Oriented RCNN (1x ss) | VMamba-S | 4*4 | 1e-4 | 76.10 | 76.70 | 1.9h | 62.4 | ? | ? | cfg
Oriented RCNN (1x ss) | VMamba-B | 4*4 | 1e-4 | 76.27 | 76.23 | 2.3h | 54.7 | ? | ? | cfg
Oriented RCNN (1x msrr) | Swin-T | 4*4 | 1e-4 | 88.45 | 81.36 | 4.6h | | | | cfg
Oriented RCNN (1x msrr) | Swin-S | 4*4 | 1e-4 | 89.58 | 81.08 | 12.5h | | | | cfg
Oriented RCNN (1x msrr) | Swin-B | 4*4 | 1e-4 | 89.36 | 81.04 | 17.5h | | | | cfg
Oriented RCNN (1x msrr) | VHeat-T | 4*4 | 1e-4 | 89.73 | 81.50 | - | | | | cfg
Oriented RCNN (1x msrr) | VHeat-S | 4*4 | 1e-4 | 89.57 | 81.37 | - | | | | cfg
Oriented RCNN (1x msrr) | VHeat-B | 4*4 | 1e-4 | 90.71 | 81.16 | - | | | | cfg
Oriented RCNN (1x msrr) | VMamba-T | 4*4 | 1e-4 | 89.78 | 80.70 | 6.8h | | | | cfg
Oriented RCNN (1x msrr) | VMamba-S | 4*4 | 1e-4 | 90.56 | 80.62 | 12.2h | | | | cfg
Oriented RCNN (1x msrr) | VMamba-B | 4*4 | 1e-4 | 90.48 | 80.97 | 15.1h | | | | cfg
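Please point out any errors promptly! Sorry that I don't check the issues very often, but I will definitely reply once I see them.
Updating the code is such a hassle, sob sob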
Copyright (c) 2024 Marina Akitsuki. All rights reserved.
Date modified: 2024/06/03