Fixing tests #936

Merged 10 commits into master on Feb 25, 2020
Conversation

@Borda Borda (Member) commented Feb 25, 2020

What does this PR do?

Fixes #918; some changes were not properly propagated.

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@Borda Borda added the bug (Something isn't working) and need fix labels Feb 25, 2020
@Borda Borda added the priority: 0 (High priority task) label Feb 25, 2020
@williamFalcon (Contributor)

@Borda wait for #938.

This PR makes many more changes than just fixing the tests. Please factor out into other PRs so we see the changes clearly.

This PR breaks a lot of GPU tests

@@ -177,14 +177,19 @@ def reset_train_dataloader(self, model):
         self.is_iterable_train_dataloader = (
             EXIST_ITER_DATASET and isinstance(self.train_dataloader.dataset, IterableDataset)
         )
-        if self.is_iterable_train_dataloader and not isinstance(self.val_check_interval, int):
+        if self.is_iterable_dataloader(self.train_dataloader) and not isinstance(self.val_check_interval, int):
Contributor:
This seems wrong. It needs to be self.is_iterable_train_dataloader, as defined on line 177.

Contributor:

Or remove line 177.

Borda (Member Author):

And what if you train with another dataloader? This checks whether it was updated...
https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/trainer.py#L857

Contributor:

I don't understand what you mean.

Borda (Member Author):

At what point is/was it determined whether the train dataset is iterable?

Borda (Member Author):

OK, found it, and it would be fine as it was... shall I revert it?
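
For readers following the thread, here is a minimal sketch of the two variants being debated. The names is_iterable_train_dataloader and is_iterable_dataloader come from the diff above; the simplified trainer class around them is assumed purely for illustration and is not the actual pytorch-lightning Trainer:

from torch.utils.data import IterableDataset

class TrainerSketch:
    # illustrative stand-in, not the real Trainer

    def is_iterable_dataloader(self, dataloader):
        # helper variant (the new line in the diff): re-inspects whichever
        # dataloader it is given, every time it is called
        return isinstance(dataloader.dataset, IterableDataset)

    def reset_train_dataloader(self, model):
        self.train_dataloader = model.train_dataloader()
        # attribute variant (the line "defined on 177"): the result is cached
        # once, at the moment the train dataloader is (re)set
        self.is_iterable_train_dataloader = self.is_iterable_dataloader(self.train_dataloader)

Whether to read the cached attribute or call the helper again on the current dataloader is exactly the question raised in the comments above.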

@williamFalcon (Contributor)

@Borda the refactors are nice, so maybe this PR is about refactoring?

@Borda Borda changed the title from "Fixing tests" to "[blocked by #938] Fixing tests" Feb 25, 2020
@Borda Borda (Member Author) commented Feb 25, 2020

@Borda the refactors are nice, so maybe this PR is about refactoring?

Sure, I wanted to do just a simple cleanup in #934, but I found that it was crashing, so I opened this to try to fix it... I see that it is still failing, so this is WIP and shall be done/merged after #938.

         # when testing make sure user defined a test step
-        if test and not (self.is_overriden('test_step') or self.is_overriden('test_end')):
+        if test_mode and not (self.is_overriden('test_step') or self.is_overriden('test_end')):
Contributor:

We should make sure, after we merge #938 into master, that we update the PR here so that auto-merge doesn't reset this line again.

@Borda Borda removed the priority: 0 (High priority task) label Feb 25, 2020
@williamFalcon (Contributor)

@Borda unblocked. Rebase so we can merge?

@Borda Borda (Member Author) commented Feb 25, 2020

There was something wrong; your tests were passing just by caching luck...
I do not have a very good feeling about prepare_data, since the data is not available until you bind the model to the trainer... nasty fix in the Template 🤖

@Borda Borda changed the title from "[blocked by #938] Fixing tests" to "Fixing tests" Feb 25, 2020
@williamFalcon (Contributor)

@Borda the tests passed on GPUs without any caching...
prepare_data is needed for TPU/DDP; there were a lot of issues without it.

@Borda Borda (Member Author) commented Feb 25, 2020

The tests are set up to use the same MNIST cache; once you download it, it is used for all tests...
You can simply check this by running just tests.test_trainer.test_trainer_max_steps_and_epochs on a blank repo; if you run all tests, some other test like tests.test_trainer.test_resume_from_checkpoint_epoch_restored does the download first...

@williamFalcon (Contributor)

prepare_data goes beyond tests... it's used for DDP and TPU.
If in the tests we happen to cache stuff, that's great, but the overall framework needs prepare_data.

@Borda Borda (Member Author) commented Feb 25, 2020

prepare_data goes beyond tests... it's used for DDP and TPU.
If in the tests we happen to cache stuff that's great. But the overall framework needs prepare_data

It seems that we are talking about slightly different things...

  • shall we agree that tests shall always pass, regardless of the order they run in
  • and that each test shall pass when run on a blank system (except under specific running conditions)?

The case here was that tests.test_trainer.test_trainer_max_steps_and_epochs created a model instance from a Template and then asked for info about the dataset before the model was bound to the trainer... the template method __dataloader then asked for the MNIST dataset without downloading it, and the dataset does not exist in a clean environment because this is its first usage (note that the other test models use our shortened TestingMNIST dataset, which has a different cache folder).

I'm really surprised that it was passing on your side lol :]
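
To make the ordering issue concrete, here is a minimal sketch with assumed paths and class names; only MNIST, prepare_data, and the download-once pattern come from the discussion itself, the rest is illustrative:

from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import MNIST

class TemplateSketch:
    # illustrative stand-in for the Template model discussed above

    def prepare_data(self):
        # the trainer calls this once the model is bound to it;
        # this is the only place the download happens
        MNIST('./datasets', train=True, download=True)

    def train_dataloader(self):
        # download=False: relies on prepare_data having already run, so calling
        # this on a blank machine before fitting raises "Dataset not found"
        dataset = MNIST('./datasets', train=True, download=False,
                        transform=transforms.ToTensor())
        return DataLoader(dataset, batch_size=32)

A test that builds such a model and asks for its dataloader before the trainer has run prepare_data therefore only passes when an earlier test has already left MNIST in the cache folder.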

@williamFalcon (Contributor)

sure.

@Borda Borda requested a review from a team February 25, 2020 15:56
@Borda Borda added the ready (PRs ready to be merged) label and removed the need fix label Feb 25, 2020
@williamFalcon williamFalcon merged commit 5dd2afe into Lightning-AI:master Feb 25, 2020
@Borda Borda deleted the fixing branch February 25, 2020 18:13
tullie pushed a commit to tullie/pytorch-lightning that referenced this pull request Apr 3, 2020
* abs import
* rename test model
* update trainer
* revert test_step check
* move tags
* fix test_step
* clean tests
* fix template
* update dataset path
* fix parent order