
Updated the DataSpec for the timing abstraction (#146) parts 3 and 4 #178

Merged
merged 12 commits into dev from ravi/new_dataloader_spec on Jan 8, 2022

Conversation

ravi-mosaicml
Contributor

@ravi-mosaicml ravi-mosaicml commented Dec 18, 2021

For the timing abstraction (#146), the `DataloaderSpec` needed two additional functions -- `get_num_samples_in_batch` and `get_num_tokens_in_batch`. It was getting messy to pass around function pointers in a named tuple, so `DataloaderSpec` was converted from a NamedTuple into a regular class called `DataSpec`. Custom datasets can inherit the base `DataSpec` class and override functionality as needed.
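As a hedged illustration of the pattern described above -- a regular class whose batch-accounting hooks can be overridden by subclassing -- a minimal sketch might look like the following. The exact signatures and defaults in composer may differ; `TextDataSpec` is a hypothetical subclass.

```python
from typing import Any, Iterable


class DataSpec:
    """Minimal sketch of a DataSpec base class (not composer's exact API)."""

    def __init__(self, dataloader: Iterable):
        self.dataloader = dataloader

    def get_num_samples_in_batch(self, batch: Any) -> int:
        # Default assumption: an (inputs, targets) tuple or a sequence
        # whose leading dimension is the batch size.
        if isinstance(batch, (list, tuple)):
            batch = batch[0]
        return len(batch)

    def get_num_tokens_in_batch(self, batch: Any) -> int:
        # Token counting is dataset-specific; subclasses override this.
        return 0


class TextDataSpec(DataSpec):
    """Hypothetical subclass that counts tokens in a token-ID batch."""

    def get_num_tokens_in_batch(self, batch: Any) -> int:
        return sum(len(seq) for seq in batch["input_ids"])
```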

Moved the `DataSpec` class to `composer.core`, as the `DataSpec` is now bound directly to the state. #120 will also need this change.

Renamed `train_dataloader` and `eval_dataloader` in the trainer and state to `train_data` and `eval_data`, since they encompass more than just the dataloader.

This PR implements parts 3 and 4 of the timing abstraction (#146). The implementation differs from the GH issue by adding `num_tokens`, `num_samples`, `get_batch_size`, and `get_num_tokens` to the new `DataSpec` rather than to the PyTorch dataset class.
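To make that difference concrete, here is a hedged sketch of how spec-level totals (`num_samples`, `num_tokens`) and the per-batch accessors (`get_batch_size`, `get_num_tokens`) might be consumed by a timing tracker. Only the field and method names come from the PR text; the `Timer` class and all defaults here are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DataSpec:
    """Sketch: batch accounting lives on the spec, not the PyTorch dataset."""
    dataloader: Any
    num_samples: Optional[int] = None  # total samples per epoch, if known
    num_tokens: Optional[int] = None   # total tokens per epoch, if known

    def get_batch_size(self, batch: Any) -> int:
        # Assumed default: (inputs, targets) tuple with a leading batch dim.
        return len(batch[0])

    def get_num_tokens(self, batch: Any) -> int:
        return 0  # dataset-specific; override in a subclass


class Timer:
    """Hypothetical tracker accumulating progress in samples and tokens."""

    def __init__(self) -> None:
        self.samples = 0
        self.tokens = 0

    def after_batch(self, spec: DataSpec, batch: Any) -> None:
        self.samples += spec.get_batch_size(batch)
        self.tokens += spec.get_num_tokens(batch)
```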
Review threads: composer/core/data_spec.py, composer/core/state.py
Contributor

@hanlint hanlint left a comment

Made a few suggestions-- I know this PR was just to update the DataSpec, but I think we should take this opportunity to remove DataSpec completely for better usability.

Review threads: composer/core/state.py, composer/core/data_spec.py, composer/trainer/trainer.py
@ravi-mosaicml
Contributor Author

As discussed offline, we will keep the `DataSpec` for now, attached to the state; a later PR will remove it. For the time being, a `DataSpec` can also be initialized from a dictionary, which serves as kwargs for the class constructor.
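A minimal sketch of that dict-as-kwargs convenience, assuming a normalization helper in the trainer (the `ensure_data_spec` name is hypothetical, and the `DataSpec` stand-in here is simplified; only the dict-to-constructor behavior comes from the discussion above):

```python
from typing import Any, Iterable, Union


class DataSpec:
    """Minimal stand-in for the real composer DataSpec."""

    def __init__(self, dataloader: Iterable):
        self.dataloader = dataloader


def ensure_data_spec(data: Union["DataSpec", dict, Iterable]) -> "DataSpec":
    # Pass a DataSpec through unchanged.
    if isinstance(data, DataSpec):
        return data
    # Treat a dict as kwargs for the DataSpec constructor.
    if isinstance(data, dict):
        return DataSpec(**data)
    # Otherwise assume a bare dataloader and wrap it.
    return DataSpec(dataloader=data)
```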

Contributor

@Averylamp Averylamp left a comment

lgtm

@ravi-mosaicml ravi-mosaicml merged commit 22cef05 into dev Jan 8, 2022
@ravi-mosaicml ravi-mosaicml deleted the ravi/new_dataloader_spec branch January 8, 2022 01:15
coryMosaicML pushed a commit to coryMosaicML/composer that referenced this pull request Feb 23, 2022
…s 3 and 4 (mosaicml#178)
