This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[BUGFIX] avoid using dict for attention cell parameter creation #1050

Merged: leezu merged 1 commit into dmlc:master from the attn-fix branch on Dec 16, 2019

Conversation

eric-haibin-lin
Member

Description

Works around apache/mxnet#17056, which causes BERT distributed training to diverge.

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

cc @dmlc/gluon-nlp-team

@eric-haibin-lin eric-haibin-lin requested a review from a team as a code owner December 13, 2019 22:38

codecov bot commented Dec 13, 2019

Codecov Report

Merging #1050 into master will increase coverage by 0.03%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1050      +/-   ##
==========================================
+ Coverage   88.24%   88.27%   +0.03%     
==========================================
  Files          67       67              
  Lines        6252     6252              
==========================================
+ Hits         5517     5519       +2     
+ Misses        735      733       -2
Impacted Files                          Coverage Δ
src/gluonnlp/model/attention_cell.py    96.22% <100%> (+0.62%) ⬆️
src/gluonnlp/model/transformer.py       91.63% <0%> (+0.32%) ⬆️

@eric-haibin-lin eric-haibin-lin added the release focus Progress focus for release label Dec 13, 2019
Member

mli commented Dec 13, 2019

Job PR-1050/1 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1050/1/index.html

Contributor

leezu commented Dec 16, 2019

Would the non-deterministic parameter order only happen with Python 3.5 and lower? Or did it happen on Python 3.6+?
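
For context on the Python-version question, here is a minimal sketch of why the creation order can depend on the interpreter. It is not the diff from this PR and does not use MXNet. The names `PROJECTIONS`, `creation_order_from_dict`, `creation_order_from_list`, and `dict_order_is_guaranteed` are hypothetical. Plain `dict` iteration order is a language guarantee only from Python 3.7 (CPython 3.6 preserves insertion order as an implementation detail), and on earlier 3.x versions string hash randomization can make it differ between processes, whereas iterating an explicit list gives the same order everywhere.

```python
import sys

# Hypothetical projection names and sizes, for illustration only;
# they are not taken from the attention cell code in this PR.
PROJECTIONS = [("query", 8), ("key", 8), ("value", 8)]


def creation_order_from_dict(projections):
    """Return names in the order a dict built from the specs iterates them.

    Before Python 3.6 this order is arbitrary and, with string hash
    randomization, may differ from one process to another.
    """
    specs = dict(projections)
    return [name for name in specs]


def creation_order_from_list(projections):
    """Return names in the order of the explicit list; stable on any version."""
    return [name for name, _ in projections]


def dict_order_is_guaranteed():
    """True when the running interpreter guarantees dict insertion order."""
    return sys.version_info >= (3, 7)


if __name__ == "__main__":
    print("dict order guaranteed:", dict_order_is_guaranteed())
    print("from dict:", creation_order_from_dict(PROJECTIONS))
    print("from list:", creation_order_from_list(PROJECTIONS))
```

If each worker in a distributed job built its parameters in a different order, anything that matches parameters across workers by position could pair up the wrong tensors. Whether that is the mechanism behind the linked MXNet issue is an assumption here, not something this conversation confirms.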

@leezu leezu merged commit 394e69a into dmlc:master Dec 16, 2019
@eric-haibin-lin eric-haibin-lin deleted the attn-fix branch February 2, 2020 06:21
4 participants