Skip to content
This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[DEV] Deprecate specifying Vocab padding, bos and eos_token as positional arguments #945

Merged
merged 10 commits into from
Oct 21, 2019

Conversation

leezu
Copy link
Contributor

@leezu leezu commented Sep 24, 2019

Description

Deprecate specification as positional arguments in preparation for dropping explicit padding_token, bos_token and eos_token arguments in favor of the general kwargs handling of #732.

nlp.Vocab({'a':10}, None, 1, '<unk>', 'PAD')  # deprecated and will be unsupported
nlp.Vocab({'a':10}, None, 1, '<unk>', padding_token='PAD')  # recommended use

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Changes

  • Deprecate specifying Vocab padding, bos and eos_token as positional arguments

Comments

  • We need to consider if we want to deprecate the current default values of padding, bos and eos_token as well in the future. Ie. if Vocab(counter) should still create Vocab(size=5, unk="<unk>", reserved="['<pad>', '<bos>', '<eos>']") or maybe Vocab(size=2, unk="<unk>", reserved="[]") The advantage is that there is nothing special about these 3 special tokens compared to other application specific special tokens; removing the default values simplifies our API. The disadvantage is that such change would affect the vast majority of users and most lines of code that instantiates Vocab. (Note, this is not part of this PR).

cc @dmlc/gluon-nlp-team

@leezu leezu requested a review from a team as a code owner September 24, 2019 20:31
@codecov
Copy link

codecov bot commented Sep 24, 2019

Codecov Report

Merging #945 into master will decrease coverage by 0.02%.
The diff coverage is 80%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #945      +/-   ##
==========================================
- Coverage   89.83%   89.81%   -0.03%     
==========================================
  Files          67       67              
  Lines        6364     6371       +7     
==========================================
+ Hits         5717     5722       +5     
- Misses        647      649       +2
Impacted Files Coverage Δ
src/gluonnlp/vocab/vocab.py 96.39% <80%> (-0.94%) ⬇️

@szha
Copy link
Member

szha commented Oct 4, 2019

PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)

@leezu leezu changed the title Deprecate specifying Vocab padding, bos and eos_token as positional arguments [DEV] Deprecate specifying Vocab padding, bos and eos_token as positional arguments Oct 4, 2019
@szha szha added the release focus Progress focus for release label Oct 8, 2019
@mli
Copy link
Member

mli commented Oct 14, 2019

Job PR-945/8 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-945/8/index.html

@mli
Copy link
Member

mli commented Oct 15, 2019

Job PR-945/10 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-945/10/index.html

@mli
Copy link
Member

mli commented Oct 18, 2019

Job PR-945/11 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-945/11/index.html

@szha szha merged commit 76ca4d7 into dmlc:master Oct 21, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
release focus Progress focus for release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants