
Early Stopping behavior #1751

Closed · marcopodda opened this issue May 7, 2020 · 10 comments · Fixed by #1863

Labels: bug (Something isn't working), help wanted (Open to be worked on)

Comments

marcopodda commented May 7, 2020

Hi there,
thanks for the great library (I am using 0.7.5). I am not following the bug report template because I'm not sure whether this is actually a bug or I simply misunderstand how early stopping is implemented. My code looks as follows:

    from pytorch_lightning import Trainer
    from pytorch_lightning.callbacks import EarlyStopping

    early_stop_callback = EarlyStopping(
        monitor='val_acc',
        min_delta=0.0,
        patience=80,
        verbose=True,
        mode=self.mode  # set elsewhere in my class ('max' for accuracy)
    )

    trainer = Trainer(
        early_stop_callback=early_stop_callback,
        auto_select_gpus=True,
        max_epochs=200,
        terminate_on_nan=True,
        show_progress_bar=True,
        fast_dev_run=False,
        gpus=1
    )

As I understand it, the model should perform early stopping only after AT LEAST 80 epochs have passed without improvement in validation accuracy. However, in my case, early stopping happened at epoch 75. Is this how it should be?

As I said, I am not sure whether this is actually a bug or a design choice (perhaps early stopping is implemented at the batch level?). If it is indeed a bug, I will work up a reproducible example. Thank you!
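
In pseudocode, the patience behavior I expect is roughly the following (validate(), max_epochs, min_delta and patience are placeholders here, not actual Lightning internals):

    # Expected semantics: stop only after `patience` consecutive epochs
    # without improvement on the monitored metric.
    best_score = float("-inf")  # mode='max': higher val_acc is better
    wait = 0
    for epoch in range(max_epochs):
        val_acc = validate()  # placeholder for one validation pass
        if val_acc > best_score + min_delta:
            best_score = val_acc
            wait = 0
        else:
            wait += 1
        if wait >= patience:  # with patience=80, at least 80 epochs must pass first
            break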

marcopodda added the bug (Something isn't working) and help wanted (Open to be worked on) labels on May 7, 2020
github-actions bot commented May 7, 2020

Hi! Thanks for your contribution, great first issue!

devforfu commented May 7, 2020

I would expect it to iterate for at least 80 epochs, too. So to me, this looks like a bug or some kind of unexpected behavior. It would be great to figure it out!

marcopodda (Author)

OK then, I'll put together a notebook to see if I can reproduce it.

marcopodda (Author)

Thanks @mateuszpieniak
Here is a working example. As you can see, it stops at epoch 41 even though patience is set to 80.
https://github.com/marcopodda/pl-es-example/blob/master/ES%20example.ipynb

elkotito (Contributor) commented May 7, 2020

It is definitely a bug. I discovered that EarlyStopping.on_epoch_end is executed twice within one epoch, which means that setting patience=160 should work around your issue temporarily.

In the file training_loop.py:
First call:

            if self.fast_dev_run or should_check_val:
                self.run_evaluation(test_mode=self.testing)
                self.call_checkpoint_callback()
                self.call_early_stop_callback()

Second call:

                # TODO wrap this logic into the callback
                if self.enable_early_stop:
                    if (met_min_epochs and met_min_steps) or self.fast_dev_run:
                        should_stop = self.early_stop_callback.on_epoch_end(self, self.get_model())
                        # stop training
                        stop = should_stop and met_min_epochs
                        if stop:
                            self.run_training_teardown()
                            return
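
To see the effect, here is a minimal, self-contained simulation (illustrative only, not Lightning code) of a patience counter that gets bumped twice per epoch:

    # With the check firing twice per epoch, patience=80 is exhausted
    # after roughly 40 non-improving epochs instead of 80.
    patience = 80
    wait = 0
    stopped_at = None
    for epoch in range(200):
        for _ in range(2):  # on_epoch_end runs twice per epoch (the bug)
            wait += 1
        if wait >= patience:
            stopped_at = epoch
            break
    print(stopped_at)  # 39 -> stops around epoch 40, not epoch 80

That is also why doubling patience to 160 restores the intended behavior until the duplicate call is removed.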

Anjum48 commented May 8, 2020

I upgraded to the bleeding-edge version yesterday and can confirm that this started happening to me too. I didn't have this issue before I upgraded (I think I was on 0.7.3 before?).

ricpruss
Yep, we ran into this as well. It is called once in the trainer and once in the on_epoch_end callback.

Borda (Member) commented May 11, 2020

@Anjum48 @ricpruss mind sending a fix as a PR?

elkotito (Contributor)

@Borda Well, I would love to make my first PL PR, if that's okay? 😉

Borda (Member) commented May 11, 2020

@mateuszpieniak sure go ahead! 🚀
