Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename train_dataloader or update docs for trainer.fit()? #3146

Closed
AtomScott opened this issue Aug 25, 2020 · 6 comments
Closed

Rename train_dataloader or update docs for trainer.fit()? #3146

AtomScott opened this issue Aug 25, 2020 · 6 comments
Labels
discussion In a discussion stage feature Is an improvement or enhancement help wanted Open to be worked on won't fix This will not be worked on

Comments

@AtomScott
Copy link

🚀 Feature

  1. Either rename train_dataloder arg to something else, like train_data_source.
    or
  2. Update docs to make it clear that is trainer.fit() and other modules (e.g. trainer.lr_find()) where a LightningDataModule can be passed instead of DataLoader.

Motivation

The new datamodule allows users to decouple data and models.
After #2755, trainer.fit(model, dataloader) and trainer.fit(model, datamodule) both work. Basically there's a line that checks whether the 2nd arg is datamodule instance.

It's nice that we can do positional args without the need for named arguments. But, I think it also causes confusion since, train_dataloader doesn't have to be DataLoader and can be a LightningDataModule instead.

Could be just me, but I got confused when I found trainer.lr_find() was be fine with having a LightningDataModule as its 2nd arg. I had to look into the source code to see that the 2nd arg is just passed and handled with trainer.fit(). So I think the current state is not that friendly to beginners.

Pitch

With #2755 the first arg of trainer.fit() has conceptually become something different from train_dataloader. It can be either a DataLoader or a LightningDataModule.

Therefore I think it may be a good idea to rename the first arg of train_dataloader to something else like train_data_source or even introduce a higher-level concept.

However, regarding backwards compatibility, I'm not sure how appropriate such a change is. How can one rename an important arg with breaking backwards compatibility?

Alternatives

  • Just update the docs for now. I.e. add a note where a LightningDataModule can be passed instead of DataLoader.
  • The type hints are incorrect (haven't been updated) in some places so fix those as well.

Additional context

@AtomScott AtomScott added feature Is an improvement or enhancement help wanted Open to be worked on labels Aug 25, 2020
@Borda
Copy link
Member

Borda commented Aug 25, 2020

I like this, just the case is compatibility with past API, rename would break code of people who are setting parameters explicitly train_dataloder=some_dataloader_or_datamodule...
cc: @williamFalcon

@Borda Borda added the discussion In a discussion stage label Aug 25, 2020
@williamFalcon
Copy link
Contributor

maybe:

train_data:Iterable?

@AtomScott
Copy link
Author

If its a data_module then it could a val_dataloader so I'm not for having "train" in the arg name. Also, data_module isn't an iterable right?

As for backward compatibility, how about just keeping the keyword arg for train_dataloder? Just not have it as the second arg.

@Borda
Copy link
Member

Borda commented Aug 27, 2020

As for backward compatibility, how about just keeping the keyword arg for train_dataloder? Just not have it as the second arg.

well for comparability it shall be quite simple we can just rename the first parameter and then add train_dataloder as last and add some deprecation warning...

@AtomScott
Copy link
Author

Yeah, that's what I meant.

@stale
Copy link

stale bot commented Oct 21, 2020

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!

@stale stale bot added the won't fix This will not be worked on label Oct 21, 2020
@stale stale bot closed this as completed Oct 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion In a discussion stage feature Is an improvement or enhancement help wanted Open to be worked on won't fix This will not be worked on
Projects
None yet
Development

No branches or pull requests

3 participants