Make GPT2Model a HybridBlock #1010

leezu · 2019-11-15T12:22:46Z

Description

Fixes #993
Fixes #1015

Checklist

Essentials

PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage
Code is well-documented

Changes

Make GPT2Model a HybridBlock

Comments

cc @dmlc/gluon-nlp-team @gigasquid

codecov · 2019-11-15T12:22:49Z

Codecov Report

Merging #1010 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1010   +/-   ##
=======================================
  Coverage   89.93%   89.93%           
=======================================
  Files          67       67           
  Lines        6340     6340           
=======================================
  Hits         5702     5702           
  Misses        638      638

mli · 2019-11-15T12:57:56Z

Job PR-1010/3 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1010/3/index.html

gigasquid · 2019-11-15T15:19:54Z

Thanks so much for helping with this @leezu.

I pulled the branch and tried to run the sequence sampling for gpt2 and got an error. I think it might actually be a problem on master with some refactoring (or I'm doing something wrong):

10:10 $ python3  sequence_sampling.py  random-sample  --bos 'Deep learning and natural language processing'   --lm-model gpt2_345m 
Namespace(beam_size=5, bos='Deep learning and natural language processing', command='random-sample', gpu=0, lm_model='gpt2_345m', max_length=20, print_num=3, temperature=1.0, use_top_k=None)
Traceback (most recent call last):
  File "sequence_sampling.py", line 187, in <module>
    generate()
  File "sequence_sampling.py", line 146, in generate
    decoder, vocab = get_decoder_vocab(args.lm_model)
  File "sequence_sampling.py", line 116, in get_decoder_vocab
    ctx=ctx)
  File "/Users/cmeier/workspace/deep-learning/gluon-nlp/scripts/text_generation/model/__init__.py", line 64, in get_model
    return models[name](**kwargs)
  File "/Users/cmeier/workspace/deep-learning/gluon-nlp/scripts/text_generation/model/gpt.py", line 383, in gpt2_345m
    **kwargs)
  File "/Users/cmeier/workspace/deep-learning/gluon-nlp/scripts/text_generation/model/gpt.py", line 429, in _get_gpt2_model
    **kwargs)
  File "/Users/cmeier/workspace/deep-learning/gluon-nlp/scripts/text_generation/model/gpt.py", line 239, in __init__
    units=units, hidden_size=units * 4, prefix='ffn{}_'.format(i)))
  File "/Users/cmeier/workspace/deep-learning/gluon-nlp/scripts/text_generation/model/gpt.py", line 186, in __init__
    self._act = GELU(approximate=True)
  File "/usr/local/lib/python3.7/site-packages/gluonnlp/model/block.py", line 106, in __init__
    super(GELU, self).__init__(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'approximate'

The error does not occur if I check out the earlier commit 4e555394baf557e5e55e1ae24a2147b03dce2213.

leezu · 2019-11-15T23:37:12Z

You need to use the development version of gluonnlp if you use the development version of the script. Use pip install git+https://github.com/dmlc/gluon-nlp.git
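
To tell in advance whether the installed gluonnlp is too old for the development script, one option is to inspect the constructor signature before calling it. The sketch below is only an illustration: `accepts_keyword`, `ReleasedGELU`, and `DevelopmentGELU` are hypothetical names that do not come from gluonnlp. The stand-in classes assume the released `GELU` only forwards `**kwargs` to its base class (as the traceback suggests) and that the development version declares `approximate` explicitly.

```python
import inspect


def accepts_keyword(cls, name):
    """Return True if cls.__init__ explicitly declares a parameter `name`."""
    params = inspect.signature(cls.__init__).parameters
    return name in params


# Hypothetical stand-in for the released block: it only takes **kwargs,
# so `approximate` is not a declared parameter.
class ReleasedGELU:
    def __init__(self, **kwargs):
        self._kwargs = kwargs


# Hypothetical stand-in for the development block: `approximate` is
# declared explicitly (assumed signature, not copied from gluonnlp).
class DevelopmentGELU:
    def __init__(self, approximate=False, **kwargs):
        self._approximate = approximate
        self._kwargs = kwargs


if __name__ == "__main__":
    for cls in (ReleasedGELU, DevelopmentGELU):
        supported = accepts_keyword(cls, "approximate")
        print(cls.__name__, "declares approximate:", supported)
```

With a real installation, the same helper could be pointed at the `GELU` class from `gluonnlp/model/block.py` (the file named in the traceback); a result of `False` would suggest that the installed package is older than the script expects.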

mli · 2019-11-18T07:22:10Z

Job PR-1010/4 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1010/4/index.html

mli · 2019-11-20T07:48:46Z

Job PR-1010/8 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1010/8/index.html

leezu requested a review from sxjscience November 15, 2019 12:22

leezu requested a review from a team as a code owner November 15, 2019 12:22

leezu force-pushed the hybridgpt2 branch 2 times, most recently from 5c87d6d to 9c3c32e Compare November 15, 2019 12:24

leezu mentioned this pull request Nov 15, 2019

Export for GPT-2 #993

Closed

leezu added 4 commits November 18, 2019 04:18

Make GPT2Model a HybridBlock

c56d8d5

Enable gpt2 test for sequence_sampling.py

1f61987

Fix

d22ca95

Hybridize

3c929b3

leezu force-pushed the hybridgpt2 branch from 9c3c32e to 3c929b3 Compare November 18, 2019 06:47

Workaround apache/mxnet#16851

f51b33b

leezu mentioned this pull request Nov 20, 2019

prev_len in gpt.py #1015

Closed

sxjscience reviewed Nov 20, 2019

scripts/text_generation/model/gpt.py (outdated, resolved)

leezu added 2 commits November 20, 2019 06:07

Fix dmlc#1015

86bd080

Enable test

a93bbf4

leezu force-pushed the hybridgpt2 branch from 74071f7 to a93bbf4 Compare November 20, 2019 06:07

Ignore warning about package resolution using __spec__ or __package__

cca82b7

sxjscience approved these changes Nov 20, 2019

leezu merged commit ebfc920 into dmlc:master Nov 20, 2019

leezu deleted the hybridgpt2 branch November 20, 2019 08:20

leezu mentioned this pull request Dec 9, 2019

Optimize Inference Performance on CPU #1035

Open
