Update transformer_tutorial.py #2363

Merged
merged 2 commits into from
May 31, 2023

Conversation

frasertajima
Contributor

@frasertajima frasertajima commented May 31, 2023

Fix for #2111 ("perhaps there is a misprint at line 40"). The original issue did not link the file, so I located it by searching online.

Review of referenced paper https://arxiv.org/pdf/1706.03762.pdf section 3.2.3 suggests (bold added):

"Similarly, self-attention layers in the decoder allow each position in the decoder to attend to all positions in the decoder up to and including that position. We need to prevent leftward information flow in the decoder to preserve the auto-regressive property. We implement this inside of scaled dot-product attention by masking out (setting to −∞) all values in the input of the softmax which correspond to illegal connections. See Figure 2."

Thus the suggested change of the reference from nn.TransformerEncoder to nn.TransformerDecoder seems reasonable.
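To make the quoted passage concrete, here is a minimal sketch of the masking it describes: scores for later positions are set to −∞ before the softmax, so each position attends only to itself and earlier positions. This is plain Python for illustration, not the tutorial's code; the helper names `causal_mask` and `masked_softmax` are hypothetical and do not come from PyTorch.

```python
import math


def causal_mask(size):
    """Build a size x size additive mask (hypothetical helper).

    Entry [i][j] is 0.0 when position i may attend to position j (j <= i)
    and -inf when j is a later position (an "illegal connection").
    """
    return [[0.0 if j <= i else float("-inf") for j in range(size)]
            for i in range(size)]


def masked_softmax(scores, mask):
    """Apply a row-wise softmax to scores + mask (hypothetical helper)."""
    result = []
    for score_row, mask_row in zip(scores, mask):
        masked = [s + m for s, m in zip(score_row, mask_row)]
        # The diagonal is never masked, so every row has a finite maximum.
        row_max = max(masked)
        exps = [math.exp(v - row_max) for v in masked]  # exp(-inf) == 0.0
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


# Uniform scores make the effect of the mask easy to see.
scores = [[0.0] * 3 for _ in range(3)]
weights = masked_softmax(scores, causal_mask(3))
for row in weights:
    print(row)
```

With uniform scores, the first position puts all of its weight on itself, the second splits its weight evenly across the first two positions, and no position assigns any weight to a later one.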

Fixes #2111

Description

Checklist

  • The issue that is being fixed is referred to in the description (see above "Fixes #ISSUE_NUMBER")
  • Only one issue is addressed in this pull request
  • Labels from the issue that this PR is fixing are added to this pull request
  • No unnecessary issues are included in this pull request.

cc @svekars @carljparker @pytorch/team-text-core @Nayef211

@facebook-github-bot
Contributor

Hi @frasertajima!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@netlify

netlify bot commented May 31, 2023

Deploy Preview for pytorch-tutorials-preview ready!

Name Link
🔨 Latest commit a57a9d2
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-tutorials-preview/deploys/64778e29a9cfd10008555221
😎 Deploy Preview https://deploy-preview-2363--pytorch-tutorials-preview.netlify.app

@github-actions github-actions bot added the grammar, docathon-h1-2023 (a label for the docathon in H1 2023), and easy labels on May 31, 2023
@svekars
Contributor

svekars commented May 31, 2023

Please sign the CLA so we can review your PR.

@frasertajima
Contributor Author

frasertajima commented May 31, 2023 via email

@facebook-github-bot
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

Contributor

@Nayef211 Nayef211 left a comment

LGTM, thanks for the fix!

@svekars svekars merged commit 510f82e into pytorch:main May 31, 2023
10 checks passed
@frasertajima frasertajima deleted the patch-1 branch June 1, 2023 00:47
Labels
docathon-h1-2023 (a label for the docathon in H1 2023), easy, grammar
Development

Successfully merging this pull request may close these issues.

perhaps there is a misprint at line 40
4 participants