HiViT (ICLR2023, notable-top-25%)

This is the official implementation of the paper HiViT: A Simpler and More Efficient Design of Hierarchical Vision Transformer.

Results

| Model | Pretraining data | ImageNet-1K | COCO Det | ADE Seg |
| --- | --- | --- | --- | --- |
| MAE-base | ImageNet-1K | 83.6 | 51.2 | 48.1 |
| SimMIM-base | ImageNet-1K | 84.0 | 52.3 | 52.8 |
| HiViT-base | ImageNet-1K | 84.6 | 53.3 | 52.8 |
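
To make the comparison easier to read, the per-benchmark differences can be computed directly from the numbers in the table. The snippet below is only a convenience sketch: the `results` dictionary and the `gains_over` helper are illustrative names that are not part of this repository.

```python
# Scores copied from the results table above (ImageNet-1K, COCO Det, ADE Seg).
results = {
    "MAE-base": (83.6, 51.2, 48.1),
    "SimMIM-base": (84.0, 52.3, 52.8),
    "HiViT-base": (84.6, 53.3, 52.8),
}
columns = ("ImageNet-1K", "COCO Det", "ADE Seg")


def gains_over(baseline, model="HiViT-base"):
    """Return the per-benchmark difference (model minus baseline), rounded to 0.1."""
    return {
        col: round(m - b, 1)
        for col, m, b in zip(columns, results[model], results[baseline])
    }


for baseline in ("MAE-base", "SimMIM-base"):
    print(baseline, gains_over(baseline))
```

With the table values, HiViT-base is ahead of MAE-base by 1.0, 2.1, and 4.7 points on the three benchmarks, and ahead of SimMIM-base by 0.6 and 1.0 points on the first two while matching it on ADE Seg.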

Pre-trained Models

mae_hivit_base_1600ep.pth

mae_hivit_base_1600ep_ft100ep.pth
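
A downloaded checkpoint can be inspected before it is passed to a model. The sketch below is a hedged example, not the repository's own loading code: it assumes the file is a standard PyTorch checkpoint whose weights may be nested under a `model` or `state_dict` key (as is common for MAE-style checkpoints) and whose parameter names may carry a `module.` prefix from distributed training. The helper name `extract_state_dict` and the parameter names in the comments are hypothetical.

```python
from collections import OrderedDict


def extract_state_dict(checkpoint, wrapper_keys=("model", "state_dict")):
    """Unwrap a loaded checkpoint object into a flat name -> value mapping.

    Assumption: weights may be nested under one of `wrapper_keys`; this
    layout is not confirmed for the files listed above.
    """
    state = checkpoint
    for key in wrapper_keys:
        # Descend only when the key holds a nested mapping, not a tensor.
        if isinstance(state, dict) and isinstance(state.get(key), dict):
            state = state[key]

    cleaned = OrderedDict()
    prefix = "module."
    for name, value in state.items():
        # Drop the prefix that DistributedDataParallel adds to parameter names.
        if name.startswith(prefix):
            name = name[len(prefix):]
        cleaned[name] = value
    return cleaned


# Usage sketch (requires PyTorch and a locally downloaded checkpoint file):
#   import torch
#   ckpt = torch.load("mae_hivit_base_1600ep.pth", map_location="cpu")
#   state_dict = extract_state_dict(ckpt)
#   print(len(state_dict), list(state_dict)[:5])
```

Printing the first few keys is a quick way to check whether the parameter names match the model definition you intend to load them into; for the actual training and fine-tuning entry points, follow the get_started.md files referenced under Usage.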

Usage

1. Supervised learning on ImageNet-1K: See supervised/get_started.md for a quick start.

2. Self-supervised learning on ImageNet-1K: See self_supervised/get_started.md.

3. Object detection: See detection/get_started.md.

4. Semantic segmentation: See segmentation/get_started.md.

BibTeX

Please consider citing our paper in your publications if the project helps your research.

@inproceedings{zhanghivit,
  title={HiViT: A Simpler and More Efficient Design of Hierarchical Vision Transformer},
  author={Zhang, Xiaosong and Tian, Yunjie and Xie, Lingxi and Huang, Wei and Dai, Qi and Ye, Qixiang and Tian, Qi},
  booktitle={International Conference on Learning Representations},
  year={2023},
}
