
run loop without validation. #536

Closed
DrClick opened this issue Nov 21, 2019 · 8 comments

Labels
feature (Is an improvement or enhancement), help wanted (Open to be worked on)

Comments

DrClick commented Nov 21, 2019

Is your feature request related to a problem? Please describe.
We would like to be able to tell the trainer not to run the validation loop at all, even though we have defined it. We have many models that inherit from our base module, and our team often finds this useful.

Describe the solution you'd like
A boolean Trainer flag, no_validation.

Describe alternatives you've considered
Setting the validation percentage to zero. This seems less clean.

Additional context
I have already implemented this and would like to submit a PR.

DrClick added the feature and help wanted labels on Nov 21, 2019
DrClick commented Nov 21, 2019

Additionally, the commit by @neggert


# from the commit, in the Trainer init
...
early_stop_callback=True,
...

# from line 185
# creates a default one if none passed in
if early_stop_callback is True:
    self.early_stop_callback = EarlyStopping(
        monitor='val_loss',
        patience=3,
        verbose=True,
        mode='min'
    )
    self.enable_early_stop = True
elif not early_stop_callback:
    self.early_stop_callback = None
    self.enable_early_stop = False
else:
    self.early_stop_callback = early_stop_callback
    self.enable_early_stop = True

seems to have an unintended consequence. If you do not pass an argument for early_stop_callback, you would assume you don't want early stopping; here, the default value True instead sets up a default EarlyStopping callback. This directly affects this issue: validation is not being called, so there is no val_loss metric, and the training loop exits. I also don't think the elif branch can even be reached (unless you explicitly pass early_stop_callback=None to the trainer), so this doesn't seem consistent.
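Given that branch logic, opting out has to be explicit. A minimal sketch of doing so, assuming the Trainer init quoted above:

    from pytorch_lightning import Trainer

    # The default early_stop_callback=True silently builds an
    # EarlyStopping(monitor='val_loss', patience=3) callback, so a
    # falsy value must be passed to reach the disabling elif branch:
    trainer = Trainer(early_stop_callback=False)  # None works too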

Is the intent of PL to provide out-of-the-box early stopping? If so, is a patience of 3 a good default?

The following is my current workaround:

    trainer = Trainer(
        gpus=hparams.gpus,
        distributed_backend=hparams.distributed_backend,
        use_amp=hparams.use_16bit,
        check_val_every_n_epoch=1,
        early_stop_callback=None,
        no_validation=True
    )

neggert commented Nov 21, 2019

That's a question for @williamFalcon. IIRC, this commit was actually a cleanup of what was already there. I tend to agree with you that we shouldn't enable early stopping by default, but I understand the argument for it.

DrClick commented Nov 21, 2019

If adding the no_validation flag is something we want in the repo, we should also discuss how to address this default early stopping: as it stands, setting no_validation will not work without also setting early_stop_callback=None. This could be OK as long as it's documented, but it feels hacky.

williamFalcon commented Nov 25, 2019

A large portion of users don't need to deal with the complexity of setting up early stopping. 3 is pretty standard in research workflows. I think 90% of use cases want auto early-stopping with a sensible default. I'd say anything other than that is definitely an edge case where setting up your own schedule makes the most sense.

In terms of turning off validation, I'm just wondering why we need another flag and can't just set:

val_percent_check = 0
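A minimal sketch of that suggestion, assuming the era's Trainer API as quoted in this thread; note the default early stopping discussed above would still look for val_loss:

    from pytorch_lightning import Trainer

    # Check 0% of the validation data, effectively skipping the
    # validation loop. The default EarlyStopping on 'val_loss'
    # still has to be disabled separately.
    trainer = Trainer(
        val_percent_check=0,
        early_stop_callback=False,
    )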

DrClick commented Nov 25, 2019

Good morning @williamFalcon. This was my original thought, but at first glance it seemed it would not work. Our workflow involves creating a base PL module for a project that handles the common methods (for instance, dataloader setup, metrics gathering, etc.), and then iterating with different architectures and techniques. In this scenario we have defined a val dataloader in the base class, but our users specifically want to be able to run without validation (which is time-consuming) for a while and see what happens, so they can iterate quickly. I thought this was more cleanly accomplished by setting a flag so that the validation loop is never called. (For instance, the sanity check could break in our example if we send all the data via hyperparameters to the train dataloader while our base class has defined a validation dataloader.) So, from the principle of separation of concerns, it seems a bit “messy” to set the validation percentage to zero.
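A minimal sketch of the base-module pattern being described, with hypothetical class names:

    import pytorch_lightning as pl
    from torch.utils.data import DataLoader

    class BaseProjectModule(pl.LightningModule):
        """Shared plumbing inherited by every experiment."""

        def train_dataloader(self):
            return DataLoader(self.train_dataset, batch_size=32)

        def val_dataloader(self):
            # Defined once here; every subclass inherits it, which is
            # why a trainer-level "skip validation" switch is attractive
            # compared with editing each subclass.
            return DataLoader(self.val_dataset, batch_size=32)

    class ExperimentA(BaseProjectModule):
        """Iterates on the architecture only; inherits the loaders."""
        ...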

It seems possible the larger problem is dealing cleanly with the already large parameter list. On one hand, although it's “ugly”, it is actually a super convenient way to set things up from an end-user point of view. Using kwargs is certainly a mixed bag; it might be the right way to go with a lot of documentation. A config dictionary could certainly work as well. Maybe there could be two trainers: a BasicTrainer with a much simplified parameter list, and a Trainer that takes a config, a dictionary, or kwargs.

Anyhow, I’m on vacation for the week, and I will be single shortly if I continue to be on GitHub. Thank you for this incredible project. I hope I can help contribute.

Borda commented Nov 25, 2019

There is a discussion about the parameters in #541.

LanceKnight commented
val_percent_check = 0

Couldn't find this flag. Was it removed in a newer update?
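For what it's worth, newer PyTorch Lightning releases renamed this flag: limit_val_batches took over the role of val_percent_check. A minimal sketch, assuming a post-1.0 API:

    from pytorch_lightning import Trainer

    # In PyTorch Lightning >= 1.0, limit_val_batches replaces
    # val_percent_check; 0 disables the validation loop entirely.
    trainer = Trainer(
        limit_val_batches=0,
        num_sanity_val_steps=0,  # also skip the pre-fit sanity check
    )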

bparaj commented Nov 16, 2022
