
Commit

Add beit transformer models (#511)
Summary:
Pull Request resolved: #511

As per the title: add BEiT transformer models.

Reviewed By: QuentinDuval

Differential Revision: D33793945

fbshipit-source-id: 3e664fb7699beb04d012039e930149e8eb4b7617
prigoyal authored and facebook-github-bot committed Feb 1, 2022
1 parent 7337369 commit 722a7cc
Showing 3 changed files with 520 additions and 0 deletions.
28 changes: 28 additions & 0 deletions vissl/config/defaults.yaml
@@ -692,6 +692,34 @@ config:
        QKV_BIAS: True # Bias for QKV in attention layers.
        QK_SCALE: False # Scale

      # ------------------------------------------------------------- #
      # BEiT.
      # https://github.com/microsoft/unilm/blob/master/beit/modeling_finetune.py
      # https://arxiv.org/pdf/2106.08254.pdf
      # ------------------------------------------------------------- #
      BEIT:
        IMAGE_SIZE: 224
        PATCH_SIZE: 16
        NUM_LAYERS: 12
        NUM_HEADS: 12
        HIDDEN_DIM: 768
        MLP_RATIO: 4.0
        # MLP and projection layer dropout rate
        DROPOUT_RATE: 0
        # Attention dropout rate
        ATTENTION_DROPOUT_RATE: 0
        # Stochastic depth dropout rate. Turning on stochastic depth and
        # using aggressive augmentation is essentially the difference
        # between a DeiT and a ViT.
        DROP_PATH_RATE: 0
        QKV_BIAS: False # Bias for QKV in attention layers.
        QK_SCALE: False # Scale
        USE_ABS_POS_EMB: True
        USE_REL_POS_BIAS: False
        USE_SHARED_REL_POS_BIAS: False
        USE_MEAN_POOLING: True
        INIT_VALUES: False

      # ------------------------------------------------------------- #
      # Parameters unique to the ConViT and not used for standard vision
      # transformers
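
For orientation, here is a minimal sketch (not part of this commit) of how the BEIT block above could be mapped onto the constructor of the reference implementation linked in its comments (modeling_finetune.py in microsoft/unilm). The attribute path cfg.MODEL.TRUNK.BEIT and the function name build_beit_trunk are assumptions for illustration, not the VISSL API; the keyword arguments follow the reference constructor's parameter names.

    # Hypothetical helper: builds a BEiT trunk from the config block above.
    # Assumes the unilm beit/ directory is importable; not defined by this commit.
    from modeling_finetune import VisionTransformer

    def build_beit_trunk(cfg):
        beit_cfg = cfg.MODEL.TRUNK.BEIT  # assumed location of the block above
        return VisionTransformer(
            img_size=beit_cfg.IMAGE_SIZE,              # 224
            patch_size=beit_cfg.PATCH_SIZE,            # 16
            embed_dim=beit_cfg.HIDDEN_DIM,             # 768
            depth=beit_cfg.NUM_LAYERS,                 # 12
            num_heads=beit_cfg.NUM_HEADS,              # 12
            mlp_ratio=beit_cfg.MLP_RATIO,              # 4.0
            qkv_bias=beit_cfg.QKV_BIAS,
            qk_scale=beit_cfg.QK_SCALE or None,        # None -> default head_dim ** -0.5
            drop_rate=beit_cfg.DROPOUT_RATE,
            attn_drop_rate=beit_cfg.ATTENTION_DROPOUT_RATE,
            drop_path_rate=beit_cfg.DROP_PATH_RATE,
            init_values=beit_cfg.INIT_VALUES or None,  # None disables LayerScale
            use_abs_pos_emb=beit_cfg.USE_ABS_POS_EMB,
            use_rel_pos_bias=beit_cfg.USE_REL_POS_BIAS,
            use_shared_rel_pos_bias=beit_cfg.USE_SHARED_REL_POS_BIAS,
            use_mean_pooling=beit_cfg.USE_MEAN_POOLING,
        )

The YAML uses False where the reference code expects None (QK_SCALE, INIT_VALUES), so the sketch converts falsy values to None before passing them through.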
3 changes: 3 additions & 0 deletions vissl/models/model_helpers.py
@@ -649,6 +649,9 @@ def __init__(self, drop_prob=None):
    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)

    def extra_repr(self) -> str:
        return "p={}".format(self.drop_prob)


to_1tuple = _ntuple(1)
to_2tuple = _ntuple(2)
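
The extra_repr addition only changes how DropPath modules print. Below is a small, self-contained sketch showing the method in context; the standalone drop_path function here is a simplified stand-in for the one defined earlier in model_helpers.py, not a copy of it.

    import torch
    import torch.nn as nn

    def drop_path(x, drop_prob=0.0, training=False):
        # Simplified stochastic depth: randomly zero whole samples in the batch
        # and rescale the survivors so the expected value is unchanged.
        if not drop_prob or not training:
            return x
        keep_prob = 1.0 - drop_prob
        mask = x.new_empty((x.shape[0],) + (1,) * (x.ndim - 1)).bernoulli_(keep_prob)
        return x * mask / keep_prob

    class DropPath(nn.Module):
        def __init__(self, drop_prob=None):
            super().__init__()
            self.drop_prob = drop_prob

        def forward(self, x):
            return drop_path(x, self.drop_prob, self.training)

        def extra_repr(self) -> str:
            # Shown inside the parentheses when the module (or a model that
            # contains it) is printed.
            return "p={}".format(self.drop_prob)

    print(DropPath(0.1))  # DropPath(p=0.1) instead of the bare DropPath()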
