Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add sdpa to ViT #29325

Closed
wants to merge 5 commits into from
Closed

add sdpa to ViT #29325

wants to merge 5 commits into from

Conversation

lyaronskaya
Copy link
Contributor

What does this PR do?

Adding support for SDPA to ViT. Fixes #28005

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed.
@ArthurZucker @fxmarty

Copy link
Contributor

@fxmarty fxmarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Can you run (preferably on a GPU):

RUN_SLOW=1 pytest tests/models/vit -k "test_eager_matches_sdpa_inference" -s -vvvvv

and report the result here?

@fxmarty
Copy link
Contributor

fxmarty commented Feb 27, 2024

Can you install ruff==0.1.5 and run make style & make fix-copies as well?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@lyaronskaya
Copy link
Contributor Author

@fxmarty Hi!

Can you run (preferably on a GPU):

RUN_SLOW=1 pytest tests/models/vit -k "test_eager_matches_sdpa_inference" -s -vvvvv

and report the result here?

CPU results

collected 223 items / 220 deselected / 3 selected                                                                                                                                                          

tests/models/vit/test_modeling_vit.py::ViTModelTest::test_eager_matches_sdpa_inference_0_float16 <- tests/test_modeling_common.py SKIPPED (float16 not supported on cpu)
tests/models/vit/test_modeling_vit.py::ViTModelTest::test_eager_matches_sdpa_inference_1_bfloat16 <- tests/test_modeling_common.py PASSED
tests/models/vit/test_modeling_vit.py::ViTModelTest::test_eager_matches_sdpa_inference_2_float32 <- tests/test_modeling_common.py PASSED`

Can you install ruff==0.1.5 and run make style & make fix-copies as well?

Result of make fix-copies has inconsistencies, so I still have to check the fix-copies code.

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! One way is either to remove the copied from and use another model as the base, or add sdpa to all of them.
I am down to add it to all of them and have @NielsRogge check once it's done!

self.attention = ASTAttention(config)
self.attention = VIT_ATTENTION_CLASSES[config._attn_implementation](config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix copies is fixing this but VIT_ATTENTION_CLASSES does not exist

self.attention = DeiTAttention(config)
self.attention = VIT_ATTENTION_CLASSES[config._attn_implementation](config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

self.attention = VideoMAEAttention(config)
self.attention = VIT_ATTENTION_CLASSES[config._attn_implementation](config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

self.attention = YolosAttention(config)
self.attention = VIT_ATTENTION_CLASSES[config._attn_implementation](config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

self.attention = ViTMSNAttention(config)
self.attention = VIT_ATTENTION_CLASSES[config._attn_implementation](config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@lyaronskaya
Copy link
Contributor Author

@ArthurZucker @fxmarty Just added sdpa to all of these models that use ViT as base.
All tests passed except for one

FAILED tests/models/videomae/test_modeling_videomae.py::VideoMAEModelTest::test_eager_matches_sdpa_inference_1_bfloat16 - RuntimeError: "mse_cpu" not implemented for 'BFloat16'

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few changes are still to be removed!

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, sorry for the late answer!
Since you added _supports_sdpa slow test should be run!
Could you try RUN_SLOW=1 pytest tests/models/ with the changed models? 🤗

@ArthurZucker
Copy link
Collaborator

Can you also rebase on main to make sure CI is full green!

@hyenal
Copy link
Contributor

hyenal commented Apr 27, 2024

Is there any update on this PR ? I d be happy to help with the remaining tasks (rebase + running the tests) if needs be :)

@huggingface huggingface deleted a comment from github-actions bot Apr 29, 2024
@amyeroberts
Copy link
Collaborator

Hi @hyenal, as this PR hasn't had any activity for over a month, feel free to open another PR with these changes and ping us for review when ready!

@hyenal
Copy link
Contributor

hyenal commented Apr 29, 2024

Hi @hyenal, as this PR hasn't had any activity for over a month, feel free to open another PR with these changes and ping us for review when ready!

Thanks a lot for the reply, I d like to give the author a chance to reply to my message before submitting a new PR (even adding me as a contributor to the repo would work @lyaronskaya ).
Otherwise I will submit the PR by the end of the week :)

@lyaronskaya
Copy link
Contributor Author

Hi @hyenal! I’m little busy, and you can take it over. I appreciate your help

amyeroberts pushed a commit that referenced this pull request May 16, 2024
remove blank line (+1 squashed commit)
Squashed commits:
[24ccd2061] [run-slow]vit_msn,vision_encoder_decoder (+24 squashed commits)
Squashed commits:
[08bd27e] [run-slow]vit_msn,vision_encoder_decoder
[ec96a8d] [run-slow]vit_msn
[ead817e] fix vit msn multi gpu
[d12cdc8] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[3fdbfa8] doc
[a3ff33e] finish implementation
[e20b7b7] Update test_modeling_common.py
[e290c58] Update test_modeling_flax_common.py
[d3af86f] comment
[ff7dd32] more comments
[59b1378] suggestion
[7e2ba6d] attn_implementation as attribute of the class
[fe66ab7] minor
[38642b5] Apply suggestions from code review

Accept comments

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[22cde7d] Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[48e137c] Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[99f4c67] Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[96cf20a] Update src/transformers/models/vit_msn/modeling_vit_msn.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[c59377d] Update src/transformers/models/vit_mae/modeling_vit_mae.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[b70a472] Update tests/models/vision_text_dual_encoder/test_modeling_vision_text_dual_encoder.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[00c84d2] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[61f00eb] all tests are passing locally
[e9e0b82] vision encoder/decoder
[4d5076b] test-vision (+20 squashed commits)
Squashed commits:
[d1add8db9] yolo
[9fde65716] fix flax
[986566c28] minor
[ca2f21d1f] vit
[3333efd7a] easy models change
[ebfc214] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[b8b8603] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[48ecc7e] all tests are passing locally
[bff7fc3] minor
[62f8830] fix yolo and text_encoder tests
[1215075] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1064cae] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[b7f52ff] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[cffaa10] fix-copies
[ef6c511] test vit hybrid
[7d4ba86] vit hybrid
[66f9190] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1fcc0a0] fixes
[cfde6eb] fixup
[e77df1e] all except yolo end encoder decoder (+17 squashed commits)
Squashed commits:
[602913e] vit + vit_mae are working
[547f6c4] RUN_SLOW=1 pytest tests/models/audio_spectrogram_transformer/ tests/models/deit/ tests/models/videomae/  passes
[61a97df] it s the complete opposite...
[aefab37] fix more tests
[71802a1] fix all torch tests
[40b12eb] encoder - decoder tests
[941552b] slow decorator where appropriate
[14d055d] has_attentions to yolo and msn
[3381fa1] add correct name
[e261316] repo consistency
[31c6d0c] fixup
[9d21427] minor fix
[11ed2e1] chore
[eca6644] add sdpa to vit-based models
[cffbf39] make fix-copies result
[6468319] fix style
[d324cd0] add sdpa for vit

Co-authored-by: Liubov Yaronskaya <luba.yaronskaya@gmail.com>
Copy link

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@amyeroberts
Copy link
Collaborator

Closing as superseded by #30555

itazap pushed a commit that referenced this pull request May 24, 2024
remove blank line (+1 squashed commit)
Squashed commits:
[24ccd2061] [run-slow]vit_msn,vision_encoder_decoder (+24 squashed commits)
Squashed commits:
[08bd27e] [run-slow]vit_msn,vision_encoder_decoder
[ec96a8d] [run-slow]vit_msn
[ead817e] fix vit msn multi gpu
[d12cdc8] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[3fdbfa8] doc
[a3ff33e] finish implementation
[e20b7b7] Update test_modeling_common.py
[e290c58] Update test_modeling_flax_common.py
[d3af86f] comment
[ff7dd32] more comments
[59b1378] suggestion
[7e2ba6d] attn_implementation as attribute of the class
[fe66ab7] minor
[38642b5] Apply suggestions from code review

Accept comments

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[22cde7d] Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[48e137c] Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[99f4c67] Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[96cf20a] Update src/transformers/models/vit_msn/modeling_vit_msn.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[c59377d] Update src/transformers/models/vit_mae/modeling_vit_mae.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[b70a472] Update tests/models/vision_text_dual_encoder/test_modeling_vision_text_dual_encoder.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[00c84d2] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[61f00eb] all tests are passing locally
[e9e0b82] vision encoder/decoder
[4d5076b] test-vision (+20 squashed commits)
Squashed commits:
[d1add8db9] yolo
[9fde65716] fix flax
[986566c28] minor
[ca2f21d1f] vit
[3333efd7a] easy models change
[ebfc214] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[b8b8603] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[48ecc7e] all tests are passing locally
[bff7fc3] minor
[62f8830] fix yolo and text_encoder tests
[1215075] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1064cae] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[b7f52ff] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[cffaa10] fix-copies
[ef6c511] test vit hybrid
[7d4ba86] vit hybrid
[66f9190] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1fcc0a0] fixes
[cfde6eb] fixup
[e77df1e] all except yolo end encoder decoder (+17 squashed commits)
Squashed commits:
[602913e] vit + vit_mae are working
[547f6c4] RUN_SLOW=1 pytest tests/models/audio_spectrogram_transformer/ tests/models/deit/ tests/models/videomae/  passes
[61a97df] it s the complete opposite...
[aefab37] fix more tests
[71802a1] fix all torch tests
[40b12eb] encoder - decoder tests
[941552b] slow decorator where appropriate
[14d055d] has_attentions to yolo and msn
[3381fa1] add correct name
[e261316] repo consistency
[31c6d0c] fixup
[9d21427] minor fix
[11ed2e1] chore
[eca6644] add sdpa to vit-based models
[cffbf39] make fix-copies result
[6468319] fix style
[d324cd0] add sdpa for vit

Co-authored-by: Liubov Yaronskaya <luba.yaronskaya@gmail.com>
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Jun 11, 2024
remove blank line (+1 squashed commit)
Squashed commits:
[24ccd2061] [run-slow]vit_msn,vision_encoder_decoder (+24 squashed commits)
Squashed commits:
[08bd27e] [run-slow]vit_msn,vision_encoder_decoder
[ec96a8d] [run-slow]vit_msn
[ead817e] fix vit msn multi gpu
[d12cdc8] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[3fdbfa8] doc
[a3ff33e] finish implementation
[e20b7b7] Update test_modeling_common.py
[e290c58] Update test_modeling_flax_common.py
[d3af86f] comment
[ff7dd32] more comments
[59b1378] suggestion
[7e2ba6d] attn_implementation as attribute of the class
[fe66ab7] minor
[38642b5] Apply suggestions from code review

Accept comments

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[22cde7d] Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[48e137c] Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[99f4c67] Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[96cf20a] Update src/transformers/models/vit_msn/modeling_vit_msn.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[c59377d] Update src/transformers/models/vit_mae/modeling_vit_mae.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[b70a472] Update tests/models/vision_text_dual_encoder/test_modeling_vision_text_dual_encoder.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[00c84d2] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[61f00eb] all tests are passing locally
[e9e0b82] vision encoder/decoder
[4d5076b] test-vision (+20 squashed commits)
Squashed commits:
[d1add8db9] yolo
[9fde65716] fix flax
[986566c28] minor
[ca2f21d1f] vit
[3333efd7a] easy models change
[ebfc214] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[b8b8603] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[48ecc7e] all tests are passing locally
[bff7fc3] minor
[62f8830] fix yolo and text_encoder tests
[1215075] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1064cae] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[b7f52ff] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[cffaa10] fix-copies
[ef6c511] test vit hybrid
[7d4ba86] vit hybrid
[66f9190] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1fcc0a0] fixes
[cfde6eb] fixup
[e77df1e] all except yolo end encoder decoder (+17 squashed commits)
Squashed commits:
[602913e] vit + vit_mae are working
[547f6c4] RUN_SLOW=1 pytest tests/models/audio_spectrogram_transformer/ tests/models/deit/ tests/models/videomae/  passes
[61a97df] it s the complete opposite...
[aefab37] fix more tests
[71802a1] fix all torch tests
[40b12eb] encoder - decoder tests
[941552b] slow decorator where appropriate
[14d055d] has_attentions to yolo and msn
[3381fa1] add correct name
[e261316] repo consistency
[31c6d0c] fixup
[9d21427] minor fix
[11ed2e1] chore
[eca6644] add sdpa to vit-based models
[cffbf39] make fix-copies result
[6468319] fix style
[d324cd0] add sdpa for vit

Co-authored-by: Liubov Yaronskaya <luba.yaronskaya@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Open to contribution: adding torch.nn.functional.scaled_dot_product_attention support for more architectures
6 participants