Loader docs #1416

Merged (13 commits) on Apr 8, 2020
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -67,6 +67,7 @@ PyTorch Lightning Documentation
hooks
hyperparameters
multi_gpu
multiple_loaders
weights_loading
optimizers
profiler
16 changes: 16 additions & 0 deletions docs/source/introduction_guide.rst
@@ -28,6 +28,22 @@ to use inheritance to very quickly create an AutoEncoder.

---------

Installing Lightning
--------------------
Lightning is trivial to install.

.. code-block:: bash

    $ conda activate my_env
    $ pip install pytorch-lightning

Or, without a conda environment, anywhere you can use pip:

.. code-block:: bash

    $ pip install pytorch-lightning
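
To verify the install (a minimal sanity check), import the package and print its version:

.. code-block:: python

    import pytorch_lightning as pl

    # a successful import plus a version string means the install worked
    print(pl.__version__)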

---------

Lightning Philosophy
--------------------
Lightning factors DL/ML code into three types:
66 changes: 66 additions & 0 deletions docs/source/multiple_loaders.rst
@@ -0,0 +1,66 @@
Multiple Datasets
=================
Lightning supports multiple dataloaders in a few ways.

1. Create a dataloader that iterates over both datasets under the hood.
2. In the validation and test loops you also have the option of returning multiple
   dataloaders, which Lightning will call sequentially.

Multiple training dataloaders
-----------------------------
For training, the best way to use multiple dataloaders is to create a ``Dataset`` class
which wraps both of your datasets (this of course also works for validation and test
datasets).

(`reference <https://discuss.pytorch.org/t/train-simultaneously-on-two-datasets/649/2>`_)

.. code-block:: python

    class ConcatDataset(torch.utils.data.Dataset):
        def __init__(self, *datasets):
            self.datasets = datasets

        def __getitem__(self, i):
            # return one sample from each wrapped dataset
            return tuple(d[i] for d in self.datasets)

        def __len__(self):
            # length is capped by the shortest wrapped dataset
            return min(len(d) for d in self.datasets)

    concat_dataset = ConcatDataset(
        datasets.ImageFolder(traindir_A),
        datasets.ImageFolder(traindir_B)
    )

    class LitModel(LightningModule):

        def train_dataloader(self):
            loader = torch.utils.data.DataLoader(
                concat_dataset,
                batch_size=args.batch_size,
                shuffle=True,
                num_workers=args.workers,
                pin_memory=True
            )
            return loader

        def val_dataloader(self):
            # same pattern as train_dataloader
            ...

        def test_dataloader(self):
            # same pattern as train_dataloader
            ...
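
Each batch produced by this loader is a tuple with one collated batch per wrapped
dataset, so you can unpack it directly in ``training_step``. A minimal sketch
(``batch_a`` and ``batch_b`` are placeholder names, not part of the example above):

.. code-block:: python

    def training_step(self, batch, batch_idx):
        # `batch` is a tuple: one collated batch per wrapped dataset
        batch_a, batch_b = batch
        ...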

Test/Val dataloaders
--------------------
For validation and test dataloaders, Lightning also gives you the additional
option of returning multiple dataloaders from each call.

See the following for more details:

- :meth:`~pytorch_lightning.core.LightningModule.val_dataloader`
- :meth:`~pytorch_lightning.core.LightningModule.test_dataloader`

.. code-block:: python

    def val_dataloader(self):
        # dataset_1 and dataset_2 stand in for your own validation datasets
        loader_1 = DataLoader(dataset_1)
        loader_2 = DataLoader(dataset_2)
        return [loader_1, loader_2]
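
When you return multiple dataloaders, the validation step receives an extra index
telling you which loader produced the batch. A sketch, assuming the two loaders above:

.. code-block:: python

    def validation_step(self, batch, batch_idx, dataloader_idx):
        # dataloader_idx is 0 for loader_1 and 1 for loader_2
        ...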