DeiT

  • Paper: Training data-efficient image transformers & distillation through attention

  • Origin Repo: facebookresearch/deit

  • Code: deit.py

  • Evaluate Transforms (a usage sketch follows the citation at the end of this list):

    # backend: pil
    # input_size: 224x224
    # `T` is the transforms namespace of the backend in use
    # (e.g. torchvision.transforms or paddle.vision.transforms)
    transforms = T.Compose([
        T.Resize(248, interpolation='bicubic'),
        T.CenterCrop(224),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])
    
    # backend: pil
    # input_size: 384x384
    transforms = T.Compose([
        T.Resize(384, interpolation='bicubic'),
        T.CenterCrop(384),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])
  • Model Details:

    | Model                   | Model Name           | Params (M) | FLOPs (G) | Top-1 (%) | Top-5 (%) | Pretrained Model |
    |-------------------------|----------------------|------------|-----------|-----------|-----------|------------------|
    | DeiT-tiny               | deit_ti              | 5.7        | 1.1       | 72.18     | 91.11     | Download         |
    | DeiT-small              | deit_s               | 22.0       | 4.2       | 79.85     | 95.04     | Download         |
    | DeiT-base               | deit_b               | 86.4       | 16.8      | 81.99     | 95.74     | Download         |
    | DeiT-tiny distilled     | deit_ti_distilled    | 5.9        | 1.1       | 74.50     | 91.89     | Download         |
    | DeiT-small distilled    | deit_s_distilled     | 22.4       | 4.3       | 81.22     | 95.39     | Download         |
    | DeiT-base distilled     | deit_b_distilled     | 87.2       | 16.9      | 83.39     | 96.49     | Download         |
    | DeiT-base 384           | deit_b_384           | 86.4       | 49.3      | 83.10     | 96.37     | Download         |
    | DeiT-base distilled 384 | deit_b_distilled_384 | 87.2       | 49.4      | 85.43     | 97.33     | Download         |
  • Citation:

    @article{touvron2020deit,
        title = {Training data-efficient image transformers & distillation through attention},
        author = {Hugo Touvron and Matthieu Cord and Matthijs Douze and Francisco Massa and Alexandre Sablayrolles and Hervé Jégou},
        journal = {arXiv preprint arXiv:2012.12877},
        year = {2020}
    }
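  • Usage (sketch):

    A minimal inference sketch applying the 224x224 evaluation transforms above. It assumes a
    torchvision-style backend and that deit.py exposes a deit_s constructor, with a separately
    downloaded checkpoint saved as deit_s.pth; these names and paths are illustrative, not
    confirmed by the repo.

    import torch
    import torchvision.transforms as T
    from PIL import Image

    from deit import deit_s  # assumed constructor name, matching the model-name column above

    # same evaluation pipeline as the 224x224 block above
    transforms = T.Compose([
        T.Resize(248, interpolation=T.InterpolationMode.BICUBIC),
        T.CenterCrop(224),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

    model = deit_s()
    model.load_state_dict(torch.load('deit_s.pth', map_location='cpu'))  # hypothetical checkpoint path
    model.eval()

    img = Image.open('example.jpg').convert('RGB')   # any RGB input image
    x = transforms(img).unsqueeze(0)                 # add batch dimension -> (1, 3, 224, 224)

    with torch.no_grad():
        logits = model(x)
    print(int(logits.argmax(dim=-1)))                # predicted ImageNet-1k class index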