Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nlp refactoring #316

Merged
merged 58 commits into from
Feb 4, 2020
Merged

nlp refactoring #316

merged 58 commits into from
Feb 4, 2020

Conversation

VahidooX
Copy link
Collaborator

@VahidooX VahidooX commented Jan 31, 2020

New PR for #295 that does not touch all files.
Changes to the nemo nlp collection:
++updated the organization of the files, functions, and classes
++**names of all the nlp related files are updated to more appropriate names
++**data_layers.py is split into separate files for each data layer
++**all utils and functions related to datasets are moved into datasets folder
++**all tokenizers are organized into a folder under datasets
++added copyright headers to all files in the collection
++replaced all prints with logging
++fixed and adopted new style
++updated all the files and fixed the imports to avoid polluting the namespace with import *
++merged with nemo refactoring updates

yzhang123 and others added 7 commits January 28, 2020 16:33
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2020

This pull request introduces 14 alerts and fixes 29 when merging fdc421b into 1fddca2 - view on LGTM.com

new alerts:

  • 7 for 'import *' may pollute namespace
  • 4 for Unused import
  • 2 for Unused local variable
  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 14 for Unused import
  • 5 for 'import *' may pollute namespace
  • 5 for Module is imported with 'import' and 'import from'
  • 3 for Unused local variable
  • 2 for Implicit string concatenation in a list

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2020

This pull request introduces 14 alerts and fixes 29 when merging 4f81260 into 1fddca2 - view on LGTM.com

new alerts:

  • 7 for 'import *' may pollute namespace
  • 4 for Unused import
  • 2 for Unused local variable
  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 14 for Unused import
  • 5 for 'import *' may pollute namespace
  • 5 for Module is imported with 'import' and 'import from'
  • 3 for Unused local variable
  • 2 for Implicit string concatenation in a list

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2020

This pull request introduces 11 alerts and fixes 29 when merging 0864e65 into 1fddca2 - view on LGTM.com

new alerts:

  • 7 for 'import *' may pollute namespace
  • 2 for Unused local variable
  • 1 for Unused import
  • 1 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 14 for Unused import
  • 5 for 'import *' may pollute namespace
  • 5 for Module is imported with 'import' and 'import from'
  • 3 for Unused local variable
  • 2 for Implicit string concatenation in a list

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2020

This pull request introduces 9 alerts and fixes 29 when merging 44bd381 into 1fddca2 - view on LGTM.com

new alerts:

  • 7 for 'import *' may pollute namespace
  • 2 for Unused local variable

fixed alerts:

  • 14 for Unused import
  • 5 for 'import *' may pollute namespace
  • 5 for Module is imported with 'import' and 'import from'
  • 3 for Unused local variable
  • 2 for Implicit string concatenation in a list

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2020

This pull request introduces 10 alerts and fixes 29 when merging b3c1513 into 1fddca2 - view on LGTM.com

new alerts:

  • 8 for 'import *' may pollute namespace
  • 2 for Unused local variable

fixed alerts:

  • 14 for Unused import
  • 5 for 'import *' may pollute namespace
  • 5 for Module is imported with 'import' and 'import from'
  • 3 for Unused local variable
  • 2 for Implicit string concatenation in a list

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>
@VahidooX VahidooX changed the title nlp refactoring [WIP] nlp refactoring Jan 31, 2020
Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>

# Conflicts:
#	scripts/export_bert_to_trt.py
#	tests/asr/test_asr.py
#	tests/asr/test_weight_share.py
#	tests/asr/test_zeroDS.py
#	tests/test_pytorch_trainers.py
#	tests/tts/test_tts.py
@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2020

This pull request introduces 5 alerts and fixes 32 when merging 2137704 into 4137c2b - view on LGTM.com

new alerts:

  • 5 for 'import *' may pollute namespace

fixed alerts:

  • 15 for Unused import
  • 5 for Unused local variable
  • 5 for 'import *' may pollute namespace
  • 5 for Module is imported with 'import' and 'import from'
  • 2 for Implicit string concatenation in a list

nemo/collections/nlp/callbacks/qa_squad_callback.py Outdated Show resolved Hide resolved
nemo/collections/nlp/data/tokenizers/bert_tokenizer.py Outdated Show resolved Hide resolved
nemo/collections/nlp/data/tokenizers/char_tokenizer.py Outdated Show resolved Hide resolved
nemo/collections/nlp/utils/callback_utils.py Outdated Show resolved Hide resolved
nemo/collections/nlp/utils/loss_utils.py Outdated Show resolved Hide resolved
@blisc
Copy link
Collaborator

blisc commented Jan 31, 2020

It would be very helpful if you can add copyright headers on all new files

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Jan 31, 2020

This pull request introduces 5 alerts and fixes 32 when merging b8f57bf into 4137c2b - view on LGTM.com

new alerts:

  • 5 for 'import *' may pollute namespace

fixed alerts:

  • 15 for Unused import
  • 5 for Unused local variable
  • 5 for 'import *' may pollute namespace
  • 5 for Module is imported with 'import' and 'import from'
  • 2 for Implicit string concatenation in a list

Signed-off-by: VahidooX <vnoroozi@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Feb 4, 2020

This pull request introduces 8 alerts and fixes 36 when merging 720f5c5 into b4aba85 - view on LGTM.com

new alerts:

  • 5 for 'import *' may pollute namespace
  • 3 for Signature mismatch in overriding method

fixed alerts:

  • 15 for Unused import
  • 6 for 'import *' may pollute namespace
  • 5 for Unused local variable
  • 5 for Module is imported with 'import' and 'import from'
  • 3 for Signature mismatch in overriding method
  • 2 for Implicit string concatenation in a list

@tkornuta-nvidia
Copy link
Contributor

Standardize:

  1. nemo_nlp.nm.trainables.huggingface.BERT vs nemo_nlp.nm.trainables.common.huggingface.BERT
  2. nemo_nlp.nm.trainables.huggingface.BERT.list_pretrained_models() vs nemo_nlp.BERT.list_pretrained_models() vs nemo_nlp.huggingface.BERT.list_pretrained_models() vs (those are in the comments of many files)

@okuchaiev
Copy link
Member

  1. Should be: "nemo_nlp.nm.trainables.huggingface.BERT" not "nemo_nlp.nm.trainables.common.huggingface.BERT".

  2. list_pretrained_models() is a method all Neural Modules have https://github.com/NVIDIA/NeMo/blob/master/nemo/core/neural_modules.py#L330 so there is nothing to standardize in @tkornuta-nvidia point (2) above

blisc
blisc previously approved these changes Feb 4, 2020
@ekmb
Copy link
Collaborator

ekmb commented Feb 4, 2020

please don't merge yet, some of the examples are broken

@okuchaiev okuchaiev requested a review from ekmb February 4, 2020 17:25
@okuchaiev
Copy link
Member

@ekmb you can request changes to block merging

Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Feb 4, 2020

This pull request introduces 24 alerts and fixes 36 when merging 04a083b into b4aba85 - view on LGTM.com

new alerts:

  • 7 for Wrong number of arguments in a call
  • 7 for Wrong number of arguments in a class instantiation
  • 5 for 'import *' may pollute namespace
  • 3 for Signature mismatch in overriding method
  • 2 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 15 for Unused import
  • 6 for 'import *' may pollute namespace
  • 5 for Unused local variable
  • 5 for Module is imported with 'import' and 'import from'
  • 3 for Signature mismatch in overriding method
  • 2 for Implicit string concatenation in a list

ekmb and others added 8 commits February 4, 2020 10:58
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>
Signed-off-by: Evelina Bakhturina <ebakhturina@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Feb 4, 2020

This pull request introduces 10 alerts and fixes 36 when merging 21f353b into b4aba85 - view on LGTM.com

new alerts:

  • 5 for 'import *' may pollute namespace
  • 3 for Signature mismatch in overriding method
  • 2 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 15 for Unused import
  • 6 for 'import *' may pollute namespace
  • 5 for Unused local variable
  • 5 for Module is imported with 'import' and 'import from'
  • 3 for Signature mismatch in overriding method
  • 2 for Implicit string concatenation in a list

yzhang123 and others added 3 commits February 4, 2020 11:29
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: VahidooX <vnoroozi@nvidia.com>
@lgtm-com
Copy link

lgtm-com bot commented Feb 4, 2020

This pull request introduces 10 alerts and fixes 36 when merging 4cbcfd4 into b4aba85 - view on LGTM.com

new alerts:

  • 5 for 'import *' may pollute namespace
  • 3 for Signature mismatch in overriding method
  • 2 for Module is imported with 'import' and 'import from'

fixed alerts:

  • 15 for Unused import
  • 6 for 'import *' may pollute namespace
  • 5 for Unused local variable
  • 5 for Module is imported with 'import' and 'import from'
  • 3 for Signature mismatch in overriding method
  • 2 for Implicit string concatenation in a list

@ekmb ekmb changed the title [WIP] nlp refactoring nlp refactoring Feb 4, 2020
@VahidooX VahidooX merged commit 42a5f20 into master Feb 4, 2020
@VahidooX VahidooX deleted the nlp_refactoring_tmp branch February 4, 2020 23:08
dcurran90 pushed a commit to dcurran90/NeMo that referenced this pull request Oct 15, 2024
* Fix formatting and linting

Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>

* Update after review

Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>

* Update after review

Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>

---------

Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants