This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[SCRIPT][API] Add RoBERTa fine-tuning scripts, add BERTClassifier to API #931

Merged: 26 commits merged on Sep 24, 2019

eric-haibin-lin
Member

@eric-haibin-lin eric-haibin-lin commented Sep 16, 2019

Description

  • add roberta argument options to the finetune script
  • move BERTClassifier to the official API

@hhexiy

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@codecov

codecov bot commented Sep 16, 2019

Codecov Report

Merging #931 into master will increase coverage by 0.8%.
The diff coverage is 36.36%.

@@           Coverage Diff            @@
##           master    #931     +/-   ##
========================================
+ Coverage   88.89%   89.7%   +0.8%     
========================================
  Files          67      67             
  Lines        6360    6408     +48     
========================================
+ Hits         5654    5748     +94     
+ Misses        706     660     -46

Impacted Files                                    Coverage Δ
src/gluonnlp/model/bert.py                        84.95% <19.51%> (-14.51%) ⬇️
src/gluonnlp/data/transforms.py                   76.92% <85.71%> (-4.68%) ⬇️
src/gluonnlp/model/parameter.py                   92% <0%> (-8%) ⬇️
src/gluonnlp/data/corpora/wikitext.py             94.82% <0%> (-5.18%) ⬇️
src/gluonnlp/data/batchify/batchify.py            93.18% <0%> (-3.41%) ⬇️
src/gluonnlp/data/dataset.py                      97.61% <0%> (-1.59%) ⬇️
src/gluonnlp/data/word_embedding_evaluation.py    96.21% <0%> (-0.76%) ⬇️
src/gluonnlp/vocab/subwords.py                    86.95% <0%> (+2.17%) ⬆️
src/gluonnlp/model/sequence_sampler.py            91.63% <0%> (+17.07%) ⬆️
... and 1 more

@mli
Member

mli commented Sep 16, 2019

Job PR-931/1 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/1/index.html

@mli
Member

mli commented Sep 16, 2019

Job PR-931/2 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/2/index.html

@mli
Member

mli commented Sep 16, 2019

Job PR-931/4 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/4/index.html

@kaonashi-tyc

kaonashi-tyc commented Sep 19, 2019

The RoBERTa model has a slightly different classifier structure by default (taking fairseq as the official implementation):

https://github.com/pytorch/fairseq/blob/718677ebb044e27aaf1a30640c2f7ab6b8fa8509/fairseq/models/roberta/model.py#L218-L235

It might deserve its own classifier of sorts.

@eric-haibin-lin eric-haibin-lin requested a review from a team as a code owner September 19, 2019 19:47
@eric-haibin-lin eric-haibin-lin changed the title [SCRIPT] Add RoBERTa fine-tuning scripts [SCRIPT][API] Add RoBERTa fine-tuning scripts, add BERTClassifier to API Sep 19, 2019
@mli
Member

mli commented Sep 19, 2019

Job PR-931/9 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/9/index.html

@mli
Member

mli commented Sep 19, 2019

Job PR-931/10 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/10/index.html

@mli
Member

mli commented Sep 20, 2019

Job PR-931/11 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/11/index.html

@mli
Member

mli commented Sep 20, 2019

Job PR-931/12 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/12/index.html

@szhengac
Member

What is this RoBERTa? Any reference?

@mli
Member

mli commented Sep 20, 2019

Job PR-931/13 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/13/index.html

@eric-haibin-lin
Member Author

@mli
Member

mli commented Sep 22, 2019

Job PR-931/15 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/15/index.html

@mli
Member

mli commented Sep 23, 2019

Job PR-931/16 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/16/index.html

src/gluonnlp/data/transforms.py
@@ -1221,17 +1221,29 @@ class BERTSentenceTransform:
        Tokenizer for the sentences.
    max_seq_length : int.
        Maximum sequence length of the sentences.
    vocab : Vocab or BERTVocab
        The vocabulary.
Contributor

It is not clear that different vocabularies are required/handled for different BERT-style models. Let's document that cls_token and sep_token are used if available, and that the transform otherwise falls back to bos_token and eos_token.

To formally specify the expected attributes (cls_token, sep_token, etc.), one could (eventually) use structural subtyping: https://mypy.readthedocs.io/en/latest/protocols.html
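
For illustration, here is a minimal sketch of how the fallback and the structural-subtyping idea could fit together. It is not the code from this PR: the names HasClsSep, HasBosEos, boundary_tokens, ToyBertVocab and ToyRobertaVocab are hypothetical, and the toy vocab classes only stand in for real vocabulary objects. The only assumption taken from the comment above is the attribute names (cls_token, sep_token, bos_token, eos_token) and the preference order.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, Tuple, runtime_checkable


@runtime_checkable
class HasClsSep(Protocol):
    """Hypothetical protocol: a vocab exposing BERT-style boundary tokens."""
    cls_token: Optional[str]
    sep_token: Optional[str]


@runtime_checkable
class HasBosEos(Protocol):
    """Hypothetical protocol: a vocab exposing bos/eos boundary tokens."""
    bos_token: Optional[str]
    eos_token: Optional[str]


def boundary_tokens(vocab: object) -> Tuple[str, str]:
    """Return (start, end) tokens: prefer cls/sep, else fall back to bos/eos."""
    # isinstance() on a runtime_checkable protocol only checks that the
    # attributes exist, so None values are filtered out explicitly.
    if (isinstance(vocab, HasClsSep)
            and vocab.cls_token is not None
            and vocab.sep_token is not None):
        return vocab.cls_token, vocab.sep_token
    if (isinstance(vocab, HasBosEos)
            and vocab.bos_token is not None
            and vocab.eos_token is not None):
        return vocab.bos_token, vocab.eos_token
    raise ValueError("vocab provides neither cls/sep nor bos/eos tokens")


# Toy stand-ins for real vocabulary objects (assumed shapes, for the demo only).
@dataclass
class ToyBertVocab:
    cls_token: str = "[CLS]"
    sep_token: str = "[SEP]"


@dataclass
class ToyRobertaVocab:
    bos_token: str = "<s>"
    eos_token: str = "</s>"


if __name__ == "__main__":
    print(boundary_tokens(ToyBertVocab()))
    print(boundary_tokens(ToyRobertaVocab()))
```

Neither toy class inherits from the protocols; they match by shape alone, which is the point of structural subtyping. A static checker such as mypy could verify the same contract at type-check time if the function parameter were annotated with one of the protocols.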

@kaonashi-tyc kaonashi-tyc left a comment

Nit

@mli
Member

mli commented Sep 24, 2019

Job PR-931/17 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-931/17/index.html

@eric-haibin-lin eric-haibin-lin merged commit d63abb8 into dmlc:master Sep 24, 2019
@eric-haibin-lin eric-haibin-lin deleted the rob-finetune branch February 2, 2020 06:23

6 participants