
Experimental: automatic whitelist/blacklist for fp16 training #2

Closed · wants to merge 4 commits

Conversation

cbcase
Contributor

@cbcase cbcase commented May 17, 2018

This PR contains v0.1 of an experimental tool to automatically enable fp16 in PyTorch with a whitelist/blacklist model for handling automatic type conversion. The included README describes both the interface and implementation in more detail.

The primary apex-wide change is a couple of hacks to enable compiling the custom fused scaling + overflow-checking kernel that amp uses for its loss scaling. Carilli and I are planning to move all the kernel implementations to the new (as of 0.4) cpp-extension interface, so my hope is that these hacks are short-lived.
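For reference, what the fused kernel combines can be written as two unfused steps in plain PyTorch: divide the gradients by the loss scale and check them for inf/nan in the same pass. This is a sketch under my own naming, not the kernel's actual interface; the fused version exists to avoid the extra traversal of the gradients.

```python
import torch

def unscale_and_check(grads, loss_scale):
    # Unfused reference: multiply each gradient by 1/loss_scale in place
    # and report whether any gradient overflowed (contains inf/nan).
    # The custom kernel does both in a single pass over the data.
    inv_scale = 1.0 / loss_scale
    overflow = False
    for g in grads:
        if not torch.isfinite(g).all():
            overflow = True
        g.mul_(inv_scale)
    return overflow
```

A dynamic loss-scaling loop would call this after backward: on overflow it skips the optimizer step and shrinks the scale, otherwise it steps with the unscaled gradients.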

The strangest hack is including the __init__.py files under apex/amp/_C/ in the source tree. They are synthesized by cffi, but when it's run for packaging (i.e., python setup.py install), it doesn't synthesize them -- so the only way to get them correctly copied over to the install directory is for them to already exist. This will all go away soon.

@cbcase cbcase requested a review from csarofeen May 17, 2018 22:57
@mcarilli
Contributor

Merged by command line.

@mcarilli mcarilli closed this May 18, 2018
@jinserk jinserk mentioned this pull request Aug 30, 2018
rohithkrn pushed a commit to rohithkrn/apex that referenced this pull request May 8, 2020
@mksenzov mksenzov mentioned this pull request Jun 4, 2020
thorjohnsen pushed a commit that referenced this pull request Jul 21, 2020
Fixing mask multiplication with grad tensors
3 participants