auto_scale_batch_size not working with datamodule #3233

Closed
carmocca opened this issue Aug 27, 2020 · 11 comments · Fixed by #3266
Labels: bug (Something isn't working), help wanted (Open to be worked on)

carmocca (Contributor) commented Aug 27, 2020

🐛 Bug

The Trainer expects the LightningModule to have self.batch_size (see scale_batch_size() in training_tricks.py). However, if one is using the new LightningDataModule, that is the class where self.batch_size should be defined.

To Reproduce

assert hasattr(lightning_data_module, "batch_size")
trainer = Trainer(auto_scale_batch_size=True)
trainer.fit(lightning_module, datamodule=lightning_data_module)
This raises:

pytorch_lightning.utilities.exceptions.MisconfigurationException: Field batch_size not found in both `model` and `model.hparams`
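
For context, a minimal datamodule of the kind that triggers this (the class name and dataset are illustrative, not from the report):

import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class MyDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=32):
        super().__init__()
        # the attribute the batch-size finder is supposed to find and scale
        self.batch_size = batch_size

    def train_dataloader(self):
        dataset = TensorDataset(torch.randn(256, 16), torch.randint(0, 2, (256,)))
        return DataLoader(dataset, batch_size=self.batch_size)

The LightningModule itself defines no batch_size, so the lookup in scale_batch_size() fails with the exception above.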

Expected behavior

auto_scale_batch_size should also work when batch_size is defined on a LightningDataModule.

Environment

* Packages:
	- numpy:             1.18.5
	- pyTorch_debug:     False
	- pyTorch_version:   1.6.0
	- pytorch-lightning: 0.9.1rc1
	- tensorboard:       2.2.0
	- tqdm:              4.48.2
carmocca added the bug (Something isn't working) and help wanted (Open to be worked on) labels on Aug 27, 2020
@awaelchli (Member)

I knew this bug report would eventually come :) The same will happen for auto_lr_find.
We need to generalize our lightning_hasattr and lightning_getattr helper functions to also include the datamodule. In total, we have:

model.attribute_name
model.hparams.attribute_name
model.dm.attribute_name

All of these should be considered by both lr_find and auto_scale_batch_size.
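
A rough sketch of what the generalized check could look like (illustrative only; the actual fix landed in #3266 and may differ):

def lightning_hasattr(model, attribute):
    # 1) plain attribute on the model
    if hasattr(model, attribute):
        return True
    # 2) attribute inside model.hparams (dict or namespace)
    hparams = getattr(model, "hparams", None)
    if isinstance(hparams, dict) and attribute in hparams:
        return True
    if hparams is not None and hasattr(hparams, attribute):
        return True
    # 3) attribute on a datamodule attached to the trainer
    trainer = getattr(model, "trainer", None)
    datamodule = getattr(trainer, "datamodule", None)
    return datamodule is not None and hasattr(datamodule, attribute)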

awaelchli self-assigned this on Aug 27, 2020
tugot17 commented Aug 28, 2020

@awaelchli So there is currently no solution to this issue?

@awaelchli (Member)

No, you need to set it manually, sorry.
But I'll try to make a PR this weekend, if not today. It should be a relatively easy fix unless I've missed something.

@SkafteNicki (Member)

@awaelchli I completely agree that it should be a simple fix, since all the attribute getting/setting is handled by our own lightning_getattr etc. functions.


awaelchli (Member) commented Sep 3, 2020

@carmocca As you may already know, we fixed this. Just a note in case you didn't see it: you now need to call .tune() instead of .fit():

trainer.tune(lightning_module, datamodule=lightning_data_module)

This is to better distinguish the training from the tuning step. However, it may be subject to change since there are some refactors happening right now.
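
A rough usage sketch (assuming the finder writes the scaled value back onto whatever object owns batch_size):

trainer = Trainer(auto_scale_batch_size=True)
trainer.tune(lightning_module, datamodule=lightning_data_module)
# if the tuner wrote the result back, the datamodule now holds the scaled value
print(lightning_data_module.batch_size)
trainer.fit(lightning_module, datamodule=lightning_data_module)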

tugot17 commented Sep 4, 2020

@awaelchli
In which version will tune be officially introduced?

@awaelchli (Member)

If it stays, in v1.0


carmocca (Contributor, Author) commented Sep 5, 2020

I am not familiar with the tuning interface. A few questions:

Right now, I can use auto_scale_batch_size by passing the option to the Trainer and calling fit. This runs the auto-scaling procedure and then starts training.

If I understand correctly, when tune is released the following will be necessary:

should_auto_scale_bs = ...  # comes from the user
trainer = Trainer(auto_scale_batch_size=should_auto_scale_bs)

if should_auto_scale_bs:
    trainer.tune(...)
trainer.fit(...)

I'm assuming tune doesn't run fit automatically when it is finished.

What would happen if the code ran trainer.tune(...) outside of the if block and auto_scale_batch_size was False?
Also, shouldn't auto_scale_batch_size be a parameter of tune instead of Trainer? (Maybe it is, I just don't know where the tune discussion is.)

@awaelchli (Member)

I do not know how to answer these questions; @williamFalcon added the tune method, so it would be best to ask him how he sees it being used in the future.

@SkafteNicki (Member)

@carmocca I think the process in the future will be:

trainer = Trainer(auto_scale_batch_size=should_auto_scale_bs)
trainer.tune(model) # will do nothing if should_auto_scale_bs=False
trainer.fit(model)

It should still be easy for the user to use these features. Moving the tuning from fit to tune is meant to disentangle the hyperparameter tuning from the actual optimization of the network, which will make it easier for us to implement more tuning algorithms in the future.

@RemonComputer

@awaelchli, I have the same issue; my code is:

# Reading and initializing the Trainer
trainer_config = config.pop('trainer_config')
trainer = Trainer(
    **trainer_config,
    callbacks=callbacks,
    logger=loggers,
    checkpoint_callback=checkpoint_callback,
    auto_scale_batch_size=(dataset_config['batch_size'] is None)
)

# optimizing batch size if batch_size is None
if dataset_config['batch_size'] is None:
    print("Batch size is None, attempting to tune batch size")
    # tuner = Tuner(trainer)
    # optimal_batch_size = tuner.scale_batch_size(
    #     model, mode='power',
    #     batch_arg_name='batch_size',
    #     datamodule=dataset)
    optimal_batch_size = trainer.tune(model, datamodule=dataset)
    print(f"Found best batch size to be: {optimal_batch_size}")
    dataset.batch_size = optimal_batch_size
I tracked the issue and I think there is a problem in the logic in parsing.py, lines 186 to 193.

The problem is that my model has an hparams attribute (I don't know where that came from), but it doesn't contain a batch_size attribute; batch_size is an attribute of the datamodule. If that if condition were executed, I think there would not be a problem.
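
To illustrate the lookup order being described (a sketch of the behavior, not the library's actual code):

def find_batch_size(model, datamodule):
    # Without the last branch, the search only looks at the model and
    # model.hparams, so a batch_size that lives on the datamodule is never
    # found and the MisconfigurationException above is raised.
    if hasattr(model, "batch_size"):
        return model.batch_size
    hparams = getattr(model, "hparams", None)
    if hparams is not None and hasattr(hparams, "batch_size"):
        return hparams.batch_size
    # Falling back to the datamodule covers the case described in this comment:
    if datamodule is not None and hasattr(datamodule, "batch_size"):
        return datamodule.batch_size
    raise RuntimeError(
        "Field batch_size not found in both `model` and `model.hparams`"
    )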
