[Feature] Implement of RAM with a gradio interface #1802

Coobiw · 2023-09-25T09:20:18Z

Motivation

With CLIPZeroShot now implemented in mmpretrain, zero-shot image classification is well supported. However, CLIP handles images that contain multiple objects poorly, because such images call for multi-label classification. Image tagging has recently become an active topic, and Tag2Text and RAM (Recognize Anything Model) can recognize multiple objects in a single image well. Conveniently, the implementation of RAM depends on the implementation of CLIP, so adding RAM is a natural next step.
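To make the single-label versus multi-label distinction concrete, here is a minimal sketch in plain Python. The labels and logits are made-up toy values, and the helper names (`single_label`, `multi_label`) are hypothetical; this is not the RAM or CLIP code from this PR, only an illustration of why a softmax-then-argmax decision returns one class while independent per-tag thresholds can return several.

```python
import math


def softmax(logits):
    """Normalize logits into a distribution that sums to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map one logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_label(labels, logits):
    """Zero-shot style decision: classes compete, only the top one is kept."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]


def multi_label(labels, logits, threshold=0.5):
    """Tagging style decision: every tag is judged on its own."""
    return [name for name, z in zip(labels, logits) if sigmoid(z) > threshold]


# Toy values, assumed for illustration only.
labels = ["dog", "frisbee", "grass", "car"]
logits = [2.0, 1.5, 1.0, -3.0]

top_class = single_label(labels, logits)
tags = multi_label(labels, logits)
```

With these toy values the single-label path keeps only "dog", while the multi-label path keeps "dog", "frisbee", and "grass" and drops "car".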

Modification

Convert the RAM checkpoint (in particular the SwinTransformer weights) to mmpretrain style.
Implement RAM with mmpretrain components and register it in the MODELS registry.
Implement RAM inference in two modes: normal and openset (in openset mode, users can define the categories themselves).
The openset inference relies on the mmpretrain CLIP.
Wrap the model in a gradio interface for users to try out.
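Converting a checkpoint to another code base's style usually comes down to renaming state-dict keys so that they match the new module names. The sketch below shows that idea only. The prefix rules and key names are hypothetical placeholders, not the mapping used by tools/model_converters/ram2mmpretrain.py, and plain Python values stand in for tensors so the example runs without PyTorch.

```python
from collections import OrderedDict

# Hypothetical (old_prefix, new_prefix) rules, for illustration only.
RENAME_RULES = [
    ("visual_encoder.", "backbone."),
]


def convert_keys(state_dict, rules=RENAME_RULES):
    """Return a new state dict whose keys follow the target naming.

    The first rule whose old prefix matches a key is applied; keys that
    match no rule are copied unchanged.
    """
    converted = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in rules:
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        if new_key in converted:
            # Two source keys mapping to one target would silently lose weights.
            raise KeyError(f"duplicate key after conversion: {new_key}")
        converted[new_key] = value
    return converted


# Stand-in checkpoint with made-up keys; real values would be tensors.
source = OrderedDict([
    ("visual_encoder.patch_embed.proj.weight", [0.1, 0.2]),
    ("label_embed", [0.3]),
])
target = convert_keys(source)
```

With a real checkpoint, the dictionary would typically be read with `torch.load` and written back with `torch.save` after renaming.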

BC-breaking (Optional)

Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

Convert RAM weights to mmpretrain style (Optional)

python tools/model_converters/ram2mmpretrain.py /xxx/ram_swin_large_14m.pth /xxx/ram_swin_large_14m_mmpretrain.pth

Weights Preparation

You need both the RAM weights and the CLIP weights. They are available as ram_swin_large_14m_mmpretrain and clip-vit-b-p16_converted.
The steps for converting the CLIP weights are described in my previous PR (Implement of Zero-Shot CLIP Classifier #1737).

Gradio Installation

pip install gradio==3.44.0

Launch Gradio WebUI

cd mmpretrain
python -m mmpretrain.models.multimodal.ram.gradio_demo /xxx/ram_swin_large_14m_mmpretrain.pth /xxx/clip-vit-b-p16_converted.pth

Demos

In normal mode, you only need to upload the image. The threshold and tag_list are not used in this mode, so you can leave them unset.

In openset mode, the threshold must be set. The tag_list is optional (the default categories include some rare and unseen ones).
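The input rules for the two modes can be sketched as a small validation helper. Everything here is an assumption made for illustration: the function name `resolve_inputs`, the comma-separated format of tag_list, the [0, 1] range for the threshold, and the placeholder default tags are not taken from the gradio demo in this PR.

```python
# Placeholder defaults; the real default categories are not listed in this PR text.
DEFAULT_TAGS = ["placeholder_tag_a", "placeholder_tag_b"]


def resolve_inputs(mode, threshold=None, tag_list=None):
    """Validate demo inputs according to the selected mode.

    normal:  threshold and tag_list are ignored.
    openset: threshold is required; tag_list falls back to the defaults.
    """
    if mode == "normal":
        return {"mode": "normal"}
    if mode == "openset":
        if threshold is None:
            raise ValueError("openset mode requires a threshold")
        if not 0.0 <= threshold <= 1.0:  # assumed range
            raise ValueError("threshold is assumed to lie in [0, 1]")
        # Assumed format: a comma-separated string such as "cat, dog".
        tags = [t.strip() for t in (tag_list or "").split(",") if t.strip()]
        return {
            "mode": "openset",
            "threshold": float(threshold),
            "tags": tags or list(DEFAULT_TAGS),
        }
    raise ValueError(f"unknown mode: {mode}")


normal_cfg = resolve_inputs("normal", threshold=0.9, tag_list="cat, dog")
openset_cfg = resolve_inputs("openset", threshold=0.5, tag_list="cat, dog")
default_cfg = resolve_inputs("openset", threshold=0.5)
```

Rejecting a missing threshold up front, rather than silently substituting a value, matches the requirement stated above that the threshold must be set in openset mode.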

Checklist

Before PR:

Pre-commit or other linting tools are used to fix the potential lint issues.
Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests.
The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
The documentation has been modified accordingly, like docstring or example tutorials.

After PR:

If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects, like MMDet or MMSeg.
CLA has been signed and all committers have signed the CLA in this PR.

…pen-mmlab#1756) * feat: impelemt DINO * chore: delete debug code * chore: impplement pre-commit * fix: fix imported package * chore: pre-commit check

…open-mmlab#1774) * add new config adapting MobileNetV2,V3 * add base model config for mobile net v3, modified all training configs of mobile net v3 inherit from the base model config * removed directory _base_/models/mobilenet_v3

* zero-shot CLIP * modify zero-shot clip config * add in1k_sub_prompt(8 prompts) for improvement * add some annotations doc * clip base class & clip_zs sub-class * some modifications of details after review * convert into and use mmpretrain-vit * modify names of some files and directories

codecov · 2023-09-25T09:30:36Z

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Files	Coverage Δ
configs/_base_/datasets/imagenet_bs128_mbv3.py	`100.00% <ø> (ø)`
configs/_base_/datasets/imagenet_bs32.py	`100.00% <ø> (ø)`
...onfigs/_base_/datasets/imagenet_bs32_pil_resize.py	`100.00% <ø> (ø)`
configs/_base_/datasets/imagenet_bs64_hivit_224.py	`100.00% <100.00%> (ø)`
configs/_base_/datasets/imagenet_bs64_swin_224.py	`100.00% <ø> (ø)`
configs/_base_/datasets/imagenet_bs64_swin_384.py	`100.00% <ø> (ø)`
configs/_base_/models/hivit/tiny_224.py	`100.00% <100.00%> (ø)`
...gs/_base_/schedules/imagenet_bs1024_adamw_hivit.py	`100.00% <100.00%> (ø)`
configs/dinov2/vit-base-p14_dinov2-pre_headless.py	`100.00% <100.00%> (ø)`
configs/sam/vit-base-p16_sam_headless.py	`100.00% <100.00%> (ø)`
... and 1 more

... and 189 files with indirect coverage changes

LALBJ and others added 4 commits August 23, 2023 10:45

[CodeCamp2023-584]Support DINO self-supervised learning in project (o…

d2ccc44

…pen-mmlab#1756) * feat: impelemt DINO * chore: delete debug code * chore: impplement pre-commit * fix: fix imported package * chore: pre-commit check

ram init commit

51a2a15

mzr1996 and others added 8 commits October 8, 2023 15:44

[Fix] Fix pipeline bug in image retrieval inferencer

06bb586

[CodeCamp2023-341] Multimodal dataset documentation supplement - COCO Retrieval

3bcf7e2

Update OFA to compat with latest huggingface.

b0a792e

Update train.py to compat with new config

4849324

Merge remote-tracking branch 'origin/main' into dev

d35c778

Bump version to v1.1.0

a4c219e

Merge remote-tracking branch 'origin/main' into pr1802/ram

4584a07

Update __init__.py

06bb1ea

mzr1996 approved these changes Oct 25, 2023

mzr1996 changed the base branch from main to dev October 25, 2023 08:23

mzr1996 merged commit ed5924b into open-mmlab:dev Oct 25, 2023
6 of 9 checks passed
