Fix bugs in scale_batch_size #2523

Closed · wants to merge 5 commits into from

Conversation

@ghost ghost commented Jul 6, 2020

Fixes #2484

  • Set model.batch_size from model.hparams.batch_size if it is not already defined
  • Changed init_val for batch_size to 0 so that the search starts from the user-defined batch_size instead of always starting at 2

@pep8speaks

pep8speaks commented Jul 6, 2020

Hello @x2-l! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-07-06 13:07:55 UTC

@mergify mergify bot requested a review from a team July 6, 2020 05:58
@codecov

codecov bot commented Jul 6, 2020

Codecov Report

Merging #2523 into master will increase coverage by 1%.
The diff coverage is 33%.

@@          Coverage Diff           @@
##           master   #2523   +/-   ##
======================================
+ Coverage      88%     89%   +1%     
======================================
  Files          69      69           
  Lines        5628    5641   +13     
======================================
+ Hits         4963    5017   +54     
+ Misses        665     624   -41     

Member

@Borda Borda left a comment

LGTM
@SkafteNicki mind check?

@Borda Borda added the bug Something isn't working label Jul 6, 2020
@mergify mergify bot requested a review from a team July 6, 2020 07:26
@SkafteNicki
Member

I am not sure this will solve your problem. Since only model.batch_size gets updated during the scaling, if you have defined your dataloaders as DataLoader(dataset, batch_size=self.hparams.batch_size), this will have no effect.
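
For context, the pattern being described looks roughly like the following LightningModule sketch (illustrative only; the class, the train_dataset attribute, and the hparams are assumptions, not code from this PR):

import pytorch_lightning as pl
from torch.utils.data import DataLoader

class MyModel(pl.LightningModule):
    def train_dataloader(self):
        # batch_size is captured when the loader is constructed, so later
        # changes to model.batch_size alone do not affect this loader
        return DataLoader(self.train_dataset, batch_size=self.hparams.batch_size)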

@ghost
Author

ghost commented Jul 6, 2020

This should be handled by the user, no? It's the same as enabling auto_lr_find=True while the user sets Optimizer(self.parameters(), lr=self.hparams.learning_rate). Or maybe scale_batch_size should consider overwriting DataLoader.batch_size when it is called?

@justusschock
Member

Maybe we also need to check if self.hparams is a mapping, because then you can't use setattr.

@SkafteNicki
Member

@x2-l DataLoader.batch_size cannot be overwritten after the loader is initialized; I already tried that because it would be the easiest solution.
@justusschock agreed. In PR #1998 I have implemented three counterparts to setattr, hasattr and getattr that search for the attribute in this order:

  1. attribute of the model, i.e. model.batch_size
  2. key of model.hparams, i.e. model.hparams['batch_size']
  3. attribute of model.hparams, i.e. model.hparams.batch_size
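
A minimal sketch of the lookup order just described (illustrative only, not the actual implementation from PR #1998; the function name and the Mapping handling are assumptions):

from collections.abc import Mapping
from typing import Any

def sketch_lightning_setattr(model: Any, name: str, value: Any) -> None:
    # 1. attribute of the model itself, e.g. model.batch_size
    if hasattr(model, name):
        setattr(model, name, value)
    # 2. key of model.hparams when hparams is dict-like, e.g. model.hparams['batch_size']
    elif hasattr(model, 'hparams') and isinstance(model.hparams, Mapping) and name in model.hparams:
        model.hparams[name] = value
    # 3. attribute of model.hparams, e.g. model.hparams.batch_size
    elif hasattr(model, 'hparams') and hasattr(model.hparams, name):
        setattr(model.hparams, name, value)
    else:
        raise AttributeError(f'{name} not found on the model or in model.hparams')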

@ghost
Author

ghost commented Jul 6, 2020

@SkafteNicki what about reinitializing the DataLoader with the new batch_size, like:

def replace_batch_size(self, dataloader, batch_size):
    # collect the loader's public attributes, which mirror its constructor
    # arguments, skipping the batch size we want to replace
    skip_keys = ['batch_size']
    dl_args = {
        k: v for k, v in dataloader.__dict__.items()
        if not k.startswith('_') and k not in skip_keys
    }

    # re-instantiate the same DataLoader class with the new batch size
    dl_args['batch_size'] = batch_size
    dataloader = type(dataloader)(**dl_args)
    return dataloader

@SkafteNicki
Member

@x2-l agree that should work

@awaelchli
Member

awaelchli commented Jul 6, 2020

@x2-l I think there is already code like this for replacing the sampler/shuffle attributes. One could factor this out and create a general function
replace_dataloader_attr(dataloader: DataLoader, name: str, new_value: Any) like you proposed and avoid duplicated code.
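
A minimal sketch of such a general helper, building on the re-instantiation idea above (illustrative only; the argument whitelist and the omission of sampler handling are assumptions, not existing Lightning code):

from typing import Any
from torch.utils.data import DataLoader

def replace_dataloader_attr(dataloader: DataLoader, name: str, new_value: Any) -> DataLoader:
    # attributes that map one-to-one onto DataLoader.__init__ arguments and do
    # not conflict with each other
    keep = ('dataset', 'batch_size', 'num_workers', 'collate_fn',
            'pin_memory', 'drop_last', 'timeout', 'worker_init_fn')
    dl_args = {k: getattr(dataloader, k) for k in keep if hasattr(dataloader, k)}
    dl_args[name] = new_value
    # sampler/batch_sampler handling is intentionally omitted here; a real
    # implementation must treat those as mutually exclusive with batch_size and shuffle
    return type(dataloader)(**dl_args)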

@Borda
Member

Borda commented Jul 6, 2020

I agree that this is quite a monkey patch; it would be better to support 'hparams.batch_size' as the batch_arg_name.

@ghost
Author

ghost commented Jul 6, 2020

@SkafteNicki @Borda should I still update this, or can it be updated together with PR #1998?

@ghost
Author

ghost commented Jul 6, 2020

On second thought, why not just raise a warning that model.batch_size will be used when both model.batch_size and model.hparams.batch_size are set? If only model.hparams.batch_size is set, the batch_size attribute will be created on the model.

The scale_batch_size function would then update both parameters so that they stay consistent no matter which one is used in the rest of the code. The same can be done for the learning rate. When these functions are called, the original values are just initial values that are replaced after the search. This also avoids the undesired scenario where self.hparams.batch_size and self.hparams.lr are used to define the dataloaders and optimizers but are never updated with the searched values.

@Borda @SkafteNicki @awaelchli thoughts?
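
A minimal sketch of the behaviour proposed above, using plain warnings.warn (illustrative only; the helper name is made up, and attribute-style hparams are assumed, so dict-style hparams would need the mapping-aware lookup sketched earlier):

import warnings

def sync_batch_arg(model, new_value, batch_arg_name='batch_size'):
    on_model = hasattr(model, batch_arg_name)
    on_hparams = hasattr(model, 'hparams') and hasattr(model.hparams, batch_arg_name)

    if on_model and on_hparams:
        warnings.warn(
            f'Both model.{batch_arg_name} and model.hparams.{batch_arg_name} are set; '
            f'model.{batch_arg_name} is used as the initial value for the search.'
        )

    # write the searched value back to both locations so the rest of the
    # user code sees a consistent value
    setattr(model, batch_arg_name, new_value)
    if on_hparams:
        setattr(model.hparams, batch_arg_name, new_value)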

@justusschock
Member

While I see your point, we should be careful about that, e.g. when you re-instantiate your model and overwrite the wrong parameter, you introduce unwanted behaviour somewhere else.

I think I'd prefer not to implicitly introduce competing new arguments if it's not necessary.

@Borda
Member

Borda commented Jul 6, 2020

I would say, do this patch/fix here because it may take some time before PR #1998 lands...
Also, this may go into a 0.8.x release, whereas PR #1998 is aimed at 0.9.

@Borda Borda added the discussion In a discussion stage label Jul 6, 2020
Enable scaling of batch size using hparams and remove setting the attribute on the model. If both are specified, a warning is raised.
@ghost
Author

ghost commented Jul 6, 2020

I enabled scaling of the batch size using model.hparams.batch_arg_name and removed setting the batch_arg_name attribute on the model. If both are specified, a warning is raised telling the user that model.batch_arg_name will be used as the initial value for the search; if that is undesired, the user can stop and change things accordingly. I think that is enough for the bug fix without introducing anything new?

@Borda Borda changed the title Fix bugs in scale_batch_size [blocked by #2223] Fix bugs in scale_batch_size Jul 15, 2020
@Borda
Member

Borda commented Jul 15, 2020

It seems to take parts from #2223, so let's merge this after it...

@mdgoldberg

Hello! Will this and/or #2223 also fix auto_lr_find? It has the same issue.

@mergify
Contributor

mergify bot commented Jul 29, 2020

This pull request is now in conflict... :(

@Borda Borda changed the title [blocked by #2223] Fix bugs in scale_batch_size Fix bugs in scale_batch_size Jul 29, 2020
@Borda
Member

Borda commented Jul 29, 2020

@ghost mind finishing this one, as #2223 has been merged...

@Borda Borda added this to the 0.9.0 milestone Aug 6, 2020
@awaelchli
Member

Hey! Recently a similar PR/issue was fixed by @SkafteNicki in #2821. Could we use the lightning_attr API he added there to implement these checks here? They look very similar.
We also need to keep compatibility with datamodules in mind, but perhaps that should be done in a different PR :)

@SkafteNicki
Member

Agree with @awaelchli that the first issue addressed in this PR is the same issue the learning rate finder had. If you could exchange all hasattr, getattr, setattr calls with the corresponding functions from this file https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/utilities/parsing.py#L145-L195, that would be great :)
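
For illustration, the swap might look roughly like this (a sketch only; the wrapper function and its arguments are made up, and the exact helper signatures should be confirmed against the linked parsing.py):

from pytorch_lightning.utilities.parsing import (
    lightning_hasattr,
    lightning_getattr,
    lightning_setattr,
)

def read_and_store_batch_size(model, batch_arg_name='batch_size', new_size=None):
    # unlike plain hasattr/getattr/setattr, these helpers also look at
    # model.hparams['batch_size'] and model.hparams.batch_size
    if not lightning_hasattr(model, batch_arg_name):
        raise ValueError(f'model has no attribute or hparam named {batch_arg_name!r}')
    current = lightning_getattr(model, batch_arg_name)
    if new_size is not None:
        lightning_setattr(model, batch_arg_name, new_size)
    return current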

@awaelchli
Member

It looks like this user and their repository completely vanished from github. Should we copy these changes over into a new PR and continue as we discussed?

@Borda
Member

Borda commented Aug 17, 2020

It looks like this user and their repository completely vanished from github. Should we copy these changes over into a new PR and continue as we discussed?

he behaves as his nick says :D
I would say, check whether we can push/rebase this PR; if yes, let's continue. If not, let's make the same branch in our repo, merge this PR into the new branch, and continue there...

@awaelchli awaelchli changed the base branch from master to bugfix/batch-size-scaler-attr August 19, 2020 10:41
williamFalcon added a commit that referenced this pull request Aug 19, 2020
…2523) (#3043)

* lightning attr fix

* revert refactor

* create test

* separate test

* changelog update

* tests

* revert

* Update pytorch_lightning/trainer/training_tricks.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
@edenlightning edenlightning modified the milestones: 0.9.0, 0.9.x Aug 20, 2020
@Borda Borda closed this Aug 20, 2020
@Borda Borda modified the milestones: 0.9.x, 0.9.0 Aug 20, 2020
Labels
bug (Something isn't working) · discussion (In a discussion stage)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Trainer.scale_batch_size requires model.batch_size instead of model.hparams.batch_size
7 participants