So, I intend to use early stopping on `train_step` and training metrics. There were some problems with this (early stopping being called twice in the training loop, not stopping at all in `'min'` mode, not stopping when there is no validation, a missing `return` in the callback class). Those were fixed quickly, but I still have some problems on current master, and in #1458 early stopping on training metrics seems to have been disabled, if I understand it correctly. This is also in the 0.8.0-dev documentation. But changing where the check is called is possible.
My question is: will early stopping on training metrics be possible going forward? Will calling an `EarlyStopping` subclass in `on_train_end` catch training metrics and stop training depending on them?
Also, I don't know if I should create another bug report for my current problem with early stopping before #1504 has been merged, which might fix it. I have not changed my code to use a subclass of `EarlyStopping`; instead I edited the `EarlyStopping` class so that `on_validation_end(self, trainer, pl_module)` returns `self._run_early_stopping_check(trainer, pl_module)` (which is going to be in #1504 anyway, if I understand correctly). Early stopping seems to work now (despite there being no val step...), but it stops too early again: not before patience has been reached, but clearly before it should.
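For concreteness, the edit described above amounts to something like the sketch below. The stub base class is only a stand-in so the snippet runs on its own; the real `EarlyStopping` and its private `_run_early_stopping_check` helper live in `pytorch_lightning.callbacks` and carry the actual patience/wait state.

```python
# Stand-in for pytorch_lightning.callbacks.EarlyStopping, only so this
# sketch is self-contained; the real class tracks best/wait and can tell
# the trainer to stop.
class EarlyStopping:
    def _run_early_stopping_check(self, trainer, pl_module):
        # the real helper compares the monitored metric against the best
        # value seen and requests a stop once patience runs out
        return True  # pretend the check ran and requested a stop

    def on_validation_end(self, trainer, pl_module):
        self._run_early_stopping_check(trainer, pl_module)  # result dropped


# The edit described above: propagate the check's result out of the hook.
class PatchedEarlyStopping(EarlyStopping):
    def on_validation_end(self, trainer, pl_module):
        return self._run_early_stopping_check(trainer, pl_module)
```

With the `return` in place, whatever calls the hook can see the outcome of the check; without it, the result was silently discarded.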
I run with `hparams.patience=150` and `hparams.min_delta=0.01`, but this happens (`epoch/mean_absolute_loss` is the mean of all `batch/mean_absolute_loss` values of an epoch, logged in `on_epoch_end`):
Way too early (provided I understand the expected behavior right), is it not?
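As a sanity check on what I expect: with `mode='min'`, an epoch should count as "no improvement" only when the loss fails to beat the best value seen so far by more than `min_delta`, and stopping should trigger once `patience` such epochs have passed since the last improvement. Here is a minimal re-implementation of that bookkeeping, as I read the intended semantics; this is not Lightning's actual code.

```python
# Minimal sketch of the patience/min_delta bookkeeping for mode='min';
# my reading of the intended semantics, not Lightning's actual code.
def stopped_epoch(losses, patience, min_delta):
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best - min_delta:  # improvement must beat best by > min_delta
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch  # training would stop at this epoch
    return None  # patience never exhausted

# A loss that improves for 10 epochs and then flatlines:
losses = [1.0 - 0.05 * i for i in range(10)] + [0.55] * 200
print(stopped_epoch(losses, patience=150, min_delta=0.01))
# stops at epoch 159 = last improvement (epoch 9) + patience (150)
```

By this reading, a run that stops earlier than (last improvement + patience) epochs, as mine does, would indeed be wrong.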
About the current behavior on master: I still seem to be able to stop training early on training metrics despite #1458, so that functionality is still there right now, correct? Any idea why my training stops this early?