This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[Enhancement] Mixed precision support for BERT finetuning #793

Merged
merged 9 commits into from
Jun 27, 2019

Conversation

eric-haibin-lin (Member)

Description

  • clean up and update documentation
  • add a dtype option to the finetuning script, using AMP (a minimal sketch of the AMP setup follows this list)
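
For reference, a minimal sketch of how MXNet's `mxnet.contrib.amp` API is typically wired into a Gluon training loop. This is illustrative only; the model, optimizer settings, and script structure below are placeholders and may differ from the actual finetuning script changed in this PR.

```python
import mxnet as mx
from mxnet import autograd, gluon
from mxnet.contrib import amp

amp.init()  # patch operators for mixed precision; call before building the model

net = gluon.nn.Dense(2)  # stand-in for the BERT classification model
net.initialize(ctx=mx.gpu(0))
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 2e-5})
amp.init_trainer(trainer)  # enable dynamic loss scaling on the trainer

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
data = mx.nd.random.uniform(shape=(8, 16), ctx=mx.gpu(0))
label = mx.nd.zeros((8,), ctx=mx.gpu(0))

with autograd.record():
    loss = loss_fn(net(data), label)
    # scale the loss so fp16 gradients do not underflow, then backprop
    with amp.scale_loss(loss, trainer) as scaled_loss:
        autograd.backward(scaled_loss)
trainer.step(data.shape[0])
```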

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@eric-haibin-lin (Member, Author)

@ptrendx FYI

@mli (Member) commented Jun 24, 2019

Job PR-793/1 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-793/1/index.html

@codecov bot commented Jun 25, 2019

Codecov Report

Merging #793 into master will decrease coverage by 0.01%.
The diff coverage is 77.77%.


@@            Coverage Diff             @@
##           master     #793      +/-   ##
==========================================
- Coverage   90.61%   90.59%   -0.02%     
==========================================
  Files          64       64              
  Lines        6295     6303       +8     
==========================================
+ Hits         5704     5710       +6     
- Misses        591      593       +2
Impacted Files                          Coverage Δ
src/gluonnlp/model/attention_cell.py    94.8% <77.77%> (-1.09%) ⬇️

@mli (Member) commented Jun 25, 2019

Job PR-793/2 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-793/2/index.html

@mli (Member) commented Jun 25, 2019

Job PR-793/3 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-793/3/index.html

@mli (Member) commented Jun 26, 2019

Job PR-793/4 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-793/4/index.html

@mli (Member) commented Jun 26, 2019

Job PR-793/5 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-793/5/index.html

@eric-haibin-lin added the release focus label on Jun 26, 2019
@eric-haibin-lin merged commit 7f20127 into dmlc:master on Jun 27, 2019
@eric-haibin-lin (Member, Author)

@ptrendx I think there are multiple places where the range of a scalar passed to an op (an eps, or a large negative value like this one) may be too large or too small for fp16. Is there a better way to fix/truncate them in AMP instead of in user code?
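
For context, a small NumPy illustration of the fp16 range issue being described (the specific constants used in attention_cell.py or the finetuning script may differ):

```python
import numpy as np

print(np.finfo(np.float16).max)   # 65504.0 -- largest finite fp16 value
print(np.finfo(np.float16).tiny)  # ~6.1e-05 -- smallest normal fp16 value

# A large negative scalar (e.g. an additive attention-mask value) overflows:
print(np.float16(-1e18))          # -inf
# A tiny eps chosen for fp32 underflows to zero:
print(np.float16(1e-12))          # 0.0
```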

@eric-haibin-lin deleted the fp16-finetune branch on February 2, 2020, 06:23