This repository has been archived by the owner on Jan 15, 2024. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 538
[REFACTOR] Refactor TokenEmbedding to reduce number of places that initialize internals #750
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Codecov Report
|
Codecov Report
@@ Coverage Diff @@
## master #750 +/- ##
=========================================
- Coverage 90.61% 81.31% -9.3%
=========================================
Files 64 64
Lines 6064 6113 +49
=========================================
- Hits 5495 4971 -524
- Misses 569 1142 +573
|
leezu
force-pushed
the
refactortokenembinternals
branch
from
June 5, 2019 13:31
90da77d
to
814de50
Compare
Job PR-750/2 is complete. |
leezu
force-pushed
the
refactortokenembinternals
branch
from
June 6, 2019 18:51
814de50
to
1976a60
Compare
Job PR-750/3 is complete. |
szha
approved these changes
Jun 8, 2019
…ternals Prior to this commit, the TokenEmbedding constructor could only construct an empty TokenEmbedding. However, an empty TokenEmbedding is of little use, thus there exist a variety of places that modify and overwrite TokenEmbedding internals after construction to "fill" the idx_to_token, idx_to_vec. Examples are _load_embedding_text or _load_embedding_serialized within the TokenEmbedding class, but also set_embedding in the Vocab class. This commits 1) makes these methods static and changes them to return the idx_to_token and idx_to_vec, 2) extends the TokenEmbedding constructor to allow constructing a "non-empty" TokenEmbedding given newly added idx_to_token and idx_to_vec arguments This change is backwards compatible in that it does not change any public API besides introducing idx_to_token and idx_to_vec arguments to TokenEmbedding. For the future, the "empty" TokenEmbedding initialization may be removed as it provides little benefit. For now it is kept for backwards compatibility.
leezu
force-pushed
the
refactortokenembinternals
branch
from
June 9, 2019 13:07
1976a60
to
a5a684c
Compare
leezu
force-pushed
the
refactortokenembinternals
branch
from
June 10, 2019 13:18
a5a684c
to
25d0ee8
Compare
Job PR-750/5 is complete. |
leezu
force-pushed
the
refactortokenembinternals
branch
from
June 10, 2019 21:44
1a67b04
to
c0243be
Compare
eric-haibin-lin
approved these changes
Jun 10, 2019
leezu
force-pushed
the
refactortokenembinternals
branch
from
June 10, 2019 23:18
c0243be
to
7fdd0bb
Compare
Job PR-750/9 is complete. |
4 tasks
This was referenced Aug 28, 2019
Merged
leezu
added a commit
to leezu/gluon-nlp
that referenced
this pull request
Aug 29, 2019
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Prior to this commit, the TokenEmbedding constructor could only construct an
empty TokenEmbedding. However, an empty TokenEmbedding is of little use, thus
there exist a variety of places that modify and overwrite TokenEmbedding
internals after construction to "fill" the idx_to_token, idx_to_vec.
Examples are _load_embedding_text or _load_embedding_serialized.
This commits
idx_to_token and idx_to_vec,
TokenEmbedding given newly added idx_to_token and idx_to_vec arguments
This change is backwards compatible in that it does not change any public API
besides introducing idx_to_token and idx_to_vec arguments to TokenEmbedding. For
the future, the "empty" TokenEmbedding initialization may be removed as it
provides little benefit. For now it is kept for backwards compatibility.
Checklist
Essentials
Changes
idx_to_token
andidx_to_vec
arguments toTokenEmbedding
Comments
This extends work of #732. Only the last commit is related to this PR.