upgrade to PTL 1.7 #4672

Merged
merged 24 commits into from
Aug 11, 2022

@nithinraok nithinraok commented Aug 3, 2022

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

What does this PR do?

Upgrade PTL version to 1.7.2

Collection: All

Changelog

  • The default max_steps is now -1 instead of None
  • DDPPlugin has moved from plugins to strategies and is renamed DDPStrategy; NLPDDPPlugin is renamed NLPDDPStrategy accordingly
  • Override the LightningModule trainer property to keep it compatible with our existing models, since it has been removed in PTL 1.7
  • The following args have been removed from Trainer:
    • prepare_data_per_node
    • checkpoint_callback - replaced with enable_checkpointing
    • process_position - now part of the TQDMProgressBar callback
    • stochastic_weight_avg - now part of a callback
    • flush_logs_every_n_steps
    • weights_summary - replaced with a callback
    • terminate_on_nan
    • log_gpu_memory - part of the DeviceStatsMonitor callback
    • progress_bar_refresh_rate - part of the TQDMProgressBar callback
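
For anyone updating their own trainer configs, the changes above can be sketched roughly as a dictionary rewrite. This is only an illustration and not code from this PR: `migrate_trainer_cfg` and `REMOVED_TRAINER_ARGS` are hypothetical names, the replacement hints simply restate the changelog entries, and the sketch works on a plain dict rather than calling PTL.

```python
from typing import Any, Dict, List, Tuple

# Trainer args listed as removed in the changelog, mapped to the replacement
# the changelog names ("" where it names none). DeviceStatsMonitor and
# TQDMProgressBar are the PTL callback class names.
REMOVED_TRAINER_ARGS: Dict[str, str] = {
    "prepare_data_per_node": "",
    "process_position": "TQDMProgressBar callback",
    "stochastic_weight_avg": "callback",
    "flush_logs_every_n_steps": "",
    "weights_summary": "callback",
    "terminate_on_nan": "",
    "log_gpu_memory": "DeviceStatsMonitor callback",
    "progress_bar_refresh_rate": "TQDMProgressBar callback",
}


def migrate_trainer_cfg(cfg: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Hypothetical helper: rewrite a pre-1.7 trainer config dict.

    Returns the updated config and a list of notes about dropped keys.
    """
    new_cfg: Dict[str, Any] = {}
    notes: List[str] = []
    for key, value in cfg.items():
        if key == "checkpoint_callback":
            # Direct rename named in the changelog.
            new_cfg["enable_checkpointing"] = value
        elif key in REMOVED_TRAINER_ARGS:
            hint = REMOVED_TRAINER_ARGS[key] or "no replacement listed"
            notes.append(f"dropped {key} ({hint})")
        else:
            new_cfg[key] = value
    # max_steps now defaults to -1, so an explicit None becomes -1.
    if new_cfg.get("max_steps", -1) is None:
        new_cfg["max_steps"] = -1
    return new_cfg, notes


old_cfg = {
    "max_steps": None,
    "checkpoint_callback": False,
    "weights_summary": "full",
    "max_epochs": 3,
}
new_cfg, notes = migrate_trainer_cfg(old_cfg)
print(new_cfg)
print(notes)
```

Dropped keys are reported rather than silently discarded, so whoever runs this can decide which callback, if any, to add by hand.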

TODO:

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

@nithinraok nithinraok force-pushed the upgrade_to_ptl_1.7 branch 4 times, most recently from c8b6c8f to 4622525 Compare August 10, 2022 17:37
@nithinraok nithinraok marked this pull request as ready for review August 10, 2022 18:42
@@ -848,6 +848,7 @@ def forward(
# Output. [sq, b, h]
# =================

# print(context_layer.device)

Can you remove this print statement?

@@ -1469,3 +1473,33 @@ def on_train_batch_end(self, outputs, batch: Any, batch_idx: int, unused: int =
        if batch_idx == self._nsys_profile_end_step and get_rank() in self._nsys_profile_ranks:
            logging.info("====== End nsys profiling ======")
            torch.cuda.cudart().cudaProfilerStop()

    def cuda(self, device=None):

Just adding a note that we need to remove this as soon as PTL 1.7.2 is out with the fix.

ericharper previously approved these changes Aug 10, 2022

@ericharper ericharper left a comment

LGTM. Thanks!

ericharper and others added 5 commits August 10, 2022 16:06

@ericharper ericharper left a comment

Re-approving.

@ericharper ericharper merged commit 4cd9b34 into main Aug 11, 2022
@ericharper ericharper deleted the upgrade_to_ptl_1.7 branch August 11, 2022 15:20
PeganovAnton pushed a commit that referenced this pull request Aug 24, 2022
* upgrade to PTL 1.7

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* min version

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* replace progressbar_refresh_rate with enable progressbar, this is callback now

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* progressbar

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* replace removed PTL 1.7 args, fix cpu tests, remove p-tune older script

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* revert ssl test fixes

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* override trainer property and fix numba grad check

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* NLPDDPlugin -> NLPDDPStrategy

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* style fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* set max_steps default as -1

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fix maxsteps in notebooks

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update trainer config

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fix speech2label jenkins

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fix speech2text jenkins

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* DDPPlugin -> DDPStrategy

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* remove provided strategy keys from trainer config nlp

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* check other examples

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* override LightningModule .cuda call to maintain pytorch default behavior

Signed-off-by: ericharper <complex451@gmail.com>

* revert gpt eval jenkins test

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* overwrite cuda class to PTL

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* review feedback

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* remove checkpoint callback from main config

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* patch fix for intentslot classification test

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* style fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: ericharper <complex451@gmail.com>
piraka9011 pushed a commit to piraka9011/NeMo that referenced this pull request Aug 25, 2022
Signed-off-by: Anas Abou Allaban <aabouallaban@pm.me>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
Signed-off-by: Hainan Xu <hainanx@nvidia.com>