auto_scale_batch_size not working with datamodule #3233

Closed
carmocca opened this issue Aug 27, 2020 · 11 comments · Fixed by #3266
Labels: bug (Something isn't working), help wanted (Open to be worked on)

carmocca (Contributor) commented Aug 27, 2020

🐛 Bug

The Trainer expects the LightningModule to have self.batch_size (see scale_batch_size() in training_tricks.py). However, if one is using the new LightningDataModule, that is the class where self.batch_size should be defined.

To Reproduce

assert hasattr(lightning_data_module, "batch_size")
trainer = Trainer(auto_scale_batch_size=True)
trainer.fit(lightning_module, datamodule=lightning_data_module)
This raises:

pytorch_lightning.utilities.exceptions.MisconfigurationException: Field batch_size not found in both `model` and `model.hparams`
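
For context, a minimal datamodule of the kind that triggers this (the class name and dataset are illustrative, not from the report):

import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class MyDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=32):
        super().__init__()
        # the attribute the batch-size finder is supposed to find and scale
        self.batch_size = batch_size

    def train_dataloader(self):
        dataset = TensorDataset(torch.randn(256, 16), torch.randint(0, 2, (256,)))
        return DataLoader(dataset, batch_size=self.batch_size)

The LightningModule itself defines no batch_size, so the lookup in scale_batch_size() fails with the exception above.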

Expected behavior

auto_scale_batch_size should also work when batch_size is defined on a LightningDataModule.

Environment

* Packages:
	- numpy:             1.18.5
	- pyTorch_debug:     False
	- pyTorch_version:   1.6.0
	- pytorch-lightning: 0.9.1rc1
	- tensorboard:       2.2.0
	- tqdm:              4.48.2
carmocca added the bug (Something isn't working) and help wanted (Open to be worked on) labels on Aug 27, 2020
@awaelchli (Member)

I knew this bug report would eventually come :) The same will happen for auto_lr_find.
We need to generalize our lightning_hasattr and lightning_getattr helper functions to also include the datamodule. In total, we have:

model.attribute_name
model.hparams.attribute_name
model.dm.attribute_name

All of these should be considered by both lr_find and auto_scale_batch_size.
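
A rough sketch of what the generalized check could look like (illustrative only; the actual fix landed in #3266 and may differ):

def lightning_hasattr(model, attribute):
    # 1) plain attribute on the model
    if hasattr(model, attribute):
        return True
    # 2) attribute inside model.hparams (dict or namespace)
    hparams = getattr(model, "hparams", None)
    if isinstance(hparams, dict) and attribute in hparams:
        return True
    if hparams is not None and hasattr(hparams, attribute):
        return True
    # 3) attribute on a datamodule attached to the trainer
    trainer = getattr(model, "trainer", None)
    datamodule = getattr(trainer, "datamodule", None)
    return datamodule is not None and hasattr(datamodule, attribute)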

awaelchli self-assigned this on Aug 27, 2020
tugot17 commented Aug 28, 2020

@awaelchli So there is currently no solution to this issue?

@awaelchli (Member)

No, you need to set it manually, sorry.
But I'll try to make a PR this weekend, if not today. It should be a relatively easy fix unless I've missed something.

@SkafteNicki (Member)

@awaelchli I completely agree that it should be a simple fix, since all the attribute getting/setting is handled by our own lightning_getattr etc. functions.


awaelchli (Member) commented Sep 3, 2020

@carmocca As you may already know, we fixed this. Just a note in case you didn't see it: you now need to call .tune() instead of .fit():

trainer.tune(lightning_module, datamodule=lightning_data_module)

This is to better distinguish the training from the tuning step. However, it may be subject to change since there are some refactors happening right now.
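
A rough usage sketch (assuming the finder writes the scaled value back onto whatever object owns batch_size):

trainer = Trainer(auto_scale_batch_size=True)
trainer.tune(lightning_module, datamodule=lightning_data_module)
# if the tuner wrote the result back, the datamodule now holds the scaled value
print(lightning_data_module.batch_size)
trainer.fit(lightning_module, datamodule=lightning_data_module)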

tugot17 commented Sep 4, 2020

@awaelchli
In which version will tune be officially introduced?

@awaelchli (Member)

If it stays, in v1.0


carmocca (Contributor, Author) commented Sep 5, 2020

I am not familiar with the tuning interface. A few questions:

Right now, I can use auto_scale_batch_size by passing the option to the Trainer and calling fit. This runs the auto-scaling procedure and then starts training.

If I understand correctly, when tune is released the following will be necessary:

should_auto_scale_bs = ...  # comes from the user
trainer = Trainer(auto_scale_batch_size=should_auto_scale_bs)

if should_auto_scale_bs:
    trainer.tune(...)
trainer.fit(...)

I'm assuming tune doesn't run fit automatically when it is finished.

What would happen if the code ran trainer.tune(...) outside of the if block and auto_scale_batch_size was False?
Also, shouldn't auto_scale_batch_size be a parameter of tune instead of Trainer? (Maybe it is, I just don't know where the tune discussion is.)

@awaelchli (Member)

I do not know how to answer these questions; @williamFalcon added the tune method, so it would be best to ask him how he sees it being used in the future.

@SkafteNicki (Member)

@carmocca I think the process in the future will be:

trainer = Trainer(auto_scale_batch_size=should_auto_scale_bs)
trainer.tune(model) # will do nothing if should_auto_scale_bs=False
trainer.fit(model)

It should still be easy for the user to use these features. Moving the tuning from fit to tune is meant to disentangle the hyperparameter tuning from the actual optimization of the network, which will make it easier for us to implement more tuning algorithms in the future.

@RemonComputer

@awaelchli, I have the same issue; my code is:

# Reading and initializing the Trainer
trainer_config = config.pop('trainer_config')
trainer = Trainer(
    **trainer_config,
    callbacks=callbacks,
    logger=loggers,
    checkpoint_callback=checkpoint_callback,
    auto_scale_batch_size=(dataset_config['batch_size'] is None)
)

# optimizing batch size if batch_size is None
if dataset_config['batch_size'] is None:
    print("Batch size is None, attempting to tune batch size")
    # tuner = Tuner(trainer)
    # optimal_batch_size = tuner.scale_batch_size(
    #     model, mode='power',
    #     batch_arg_name='batch_size',
    #     datamodule=dataset)
    optimal_batch_size = trainer.tune(model, datamodule=dataset)
    print(f"Found best batch size to be: {optimal_batch_size}")
    dataset.batch_size = optimal_batch_size
I tracked the issue and I think there is a problem in the logic in parsing.py, lines 186 to 193.

The problem is that my model has an hparams attribute (I don't know where that came from), but it doesn't contain a batch_size attribute; batch_size is an attribute of the datamodule. If that if condition were executed, I think there would not be a problem.
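
To illustrate the lookup order being described (a sketch of the behavior, not the library's actual code):

def find_batch_size(model, datamodule):
    # Without the last branch, the search only looks at the model and
    # model.hparams, so a batch_size that lives on the datamodule is never
    # found and the MisconfigurationException above is raised.
    if hasattr(model, "batch_size"):
        return model.batch_size
    hparams = getattr(model, "hparams", None)
    if hparams is not None and hasattr(hparams, "batch_size"):
        return hparams.batch_size
    # Falling back to the datamodule covers the case described in this comment:
    if datamodule is not None and hasattr(datamodule, "batch_size"):
        return datamodule.batch_size
    raise RuntimeError(
        "Field batch_size not found in both `model` and `model.hparams`"
    )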
