Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Models docstrings #469

Merged
merged 135 commits into from
Mar 9, 2022
Merged

Models docstrings #469

merged 135 commits into from
Mar 9, 2022

Conversation

A-Jacobson
Copy link
Contributor

@A-Jacobson A-Jacobson commented Feb 11, 2022

#401

  • not sure how far i should go with the loss.py, initializers.py and model architecture files. The first two probably need a good refactor, loss -> metrics and initializers -> (not sure, maybe deletion). The model architectures are copied from all over the place. Am adding docstrings but am reluctant to refactor. Our ComposerModel versions should be the public facing interface for these.

  • unsure about module level docstring, should i do something like this? https://docs.mosaicml.com/en/v0.3.1/models.html. or link to the incoming model cards that @ajaysaini725 is writing to cover the model descriptions.

@ravi-mosaicml
Copy link
Contributor

  • not sure how far i should go with the loss.py, initializers.py and model architecture files. The first two probably need a good refactor, loss -> metrics and initializers -> (not sure, maybe deletion). The model architectures are copied from all over the place. Am adding docstrings but am reluctant to refactor. Our ComposerModel versions should be the public facing interface for these.

If a refactor is desperately needed, then let's do it. For PR review, though, could this refactor be done separately?

A stub that would link to the model cards works.

@hanlint hanlint added the documentation Improvements or additions to documentation label Feb 16, 2022
@hanlint
Copy link
Contributor

hanlint commented Feb 16, 2022

Thanks for putting this together! A few comments:

ComposerModel:

  • Can you add more information to the ComposerModel base class docstrings? e.g. especially the coupling of forward/loss and validate/metrics.

  • forward: its worth mentiong that the input is from the dataloader. we want to tell folks as much about our trainer loop logic as possible. same thing for loss -- say that outputs come from forward, and batch from the dataloader

  • metrics: similarly explain that the metrics shuld be torchmetrics. The "warning" should hide the torchmetrics link (e.g. See torchmetrics[link] for more details). Mention that one can return multiple metrics that would all be evaluated on the eval_dataloader.

  • ComposerTransformer docstring should also be expanded with more information about how it works and example usages

  • same for TIMM, BERT, ResNet, GPTmodel, etc, we need some code blocks that show its usage, and more detail on how it works.

  • TIMM has some formatting issues with testcode and also the model_name and links in the parameters. See: https://mosaicml-composer--469.com.readthedocs.build/en/469/api_reference/composer.models.timm.model.html#composer.models.timm.model.Timm

@hanlint hanlint linked an issue Feb 17, 2022 that may be closed by this pull request
@A-Jacobson A-Jacobson changed the title [WIP] models docstrings docstrings for models Feb 18, 2022
@A-Jacobson A-Jacobson changed the title docstrings for models Models docstrings Feb 18, 2022
@A-Jacobson A-Jacobson added this to the v0.4 milestone Feb 18, 2022
@A-Jacobson A-Jacobson requested a review from growlix March 9, 2022 11:28
@A-Jacobson
Copy link
Contributor Author

A-Jacobson commented Mar 9, 2022

Done:

  • Added all file level docstrings.
  • embedding links.
  • added all the `` marks...everywhere.
  • vit docstrings, sorta minimal.
  • bare minimum ssd docstrings.
  • refactored effecientnet to hide non user-facing blocks.
  • removed initializers from resnet9 init as they weren't being used. you can add them back if you add the logic to actually apply them =)

Not Done:

  • embedded paper links, they are cited as [Name](Link) not [Name](link)(Author et al, Date). Should i do (Author et al, Date)? Is there a relatively painless way to get this information?
  • some links are still broken: transformers, base types, some torchmetrics. These broken links don't have readthedocs paths available for us. I've linked to the relevant documentation or github pages for some. I've left others alone either because we're working to remove them (Batch types), or they're in an awkward place to add a normal link (Return or Args). Open to suggestions here but I do think it's ok to send this off without them.
  • codeblocks -> doctests in abstract classes, punting to a later PR as they require fixtures + some are psuedocode.
  • State formatting on metrics, adding them as attrs will cause more trouble than living with the less pretty formatting for now.
  • SSD is a lot.. I know it's not perfect but I'd like to get this PR moving before even more models appear.
  • general refactoring (standardizing module naming and layout) is being punted to here, composer.models cleanup #536. This PR is already too crazy, lets end the pain.

@growlix I choose you!

@growlix
Copy link
Contributor

growlix commented Mar 9, 2022

Looking great, Austin. Hope your eyes/fingers aren't too bloody.

embedded paper links, they are cited as Name not Name(Author et al, Date). Should i do (Author et al, Date)? Is there a relatively painless way to get this information?

I'd go Name(Author et al, Date) if possible; Author et al, Date is also acceptable. I think author attribution is the most important part. Unfortunately I don't know if there's an easy way to do this other than clicking through to the actual page.

some links are still broken: transformers, base types, some torchmetrics. These broken links don't have readthedocs paths available for us. I've linked to the relevant documentation or github pages for some. I've left others alone either because we're working to remove them (Batch types), or they're in an awkward place to add a normal link (Return or Args). Open to suggestions here but I do think it's ok to send this off without them.

Agree. I think you can leave as-is.

codeblocks -> doctests in abstract classes, punting to a later PR as they require fixtures + some are psuedocode.

Sounds good, just remember to file an issue.

SSD is a lot.. I know it's not perfect but I'd like to get this PR moving before even more models appear.

I'll double check for anything egregious, but otherwise sounds good, file an issue.

State formatting on metrics, adding them as attrs will cause more trouble than living with the less pretty formatting for now.

As per Bandish's suggestion, I think you can just move the state info to a new paragraph in the summary and preface it with something like

Adds the following attributes to :class:`composer.core.state`:

(Don't quote my syntax on that)

Unfortunately I think the current rendering is a bit too sloppy.

general refactoring (standardizing module naming and layout) is being punted to here, #536. This PR is already too crazy, lets end the pain.

  • 1

@A-Jacobson
Copy link
Contributor Author

A-Jacobson commented Mar 9, 2022

Copy link
Contributor

@growlix growlix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSD files need __all__, but otherwise looks p good

composer/models/ssd/ssd.py Outdated Show resolved Hide resolved
composer/models/ssd/ssd.py Show resolved Hide resolved
composer/models/ssd/ssd300.py Show resolved Hide resolved
composer/models/ssd/ssd300.py Show resolved Hide resolved
composer/models/ssd/ssd_hparams.py Show resolved Hide resolved
composer/models/ssd/utils.py Show resolved Hide resolved
A-Jacobson and others added 7 commits March 9, 2022 13:50
Co-authored-by: Matthew <growlix@users.noreply.github.com>
Co-authored-by: Matthew <growlix@users.noreply.github.com>
Co-authored-by: Matthew <growlix@users.noreply.github.com>
@A-Jacobson A-Jacobson merged commit 4d8ebc1 into dev Mar 9, 2022
@A-Jacobson A-Jacobson deleted the models-docs branch March 9, 2022 22:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Docstrings cleanup! composer.models
4 participants