Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update train.py #543

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

mst-rajatmishra
Copy link

this PR is primarily focused on improving code quality and robustness. The improvements include better environment variable handling, enhanced type hinting, more descriptive logging, robust error handling, configuration validation, and improved code readability.

ThankYou,
RAJAT MISHRA.

os.environ.pop("SLURM_NTASKS_PER_NODE", None)
# Set environment variables
os.environ["USE_LIBUV"] = "0"
os.environ["SLURM_NTASKS"] = os.environ.get("SLURM_NTASKS", None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can remove line 16-18

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes if we don’t need to handle SLURM-related environment variables.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need that, it should work fine in most slurm envs.

@leng-yue
Copy link
Member

Also... I don't see there is a need to make all first-letter of comments uppercase...

leng-yue
leng-yue previously approved these changes Sep 13, 2024
@@ -1,8 +1,6 @@
import os

os.environ["USE_LIBUV"] = "0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libuv should be in front of everything

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah so that environment variable is set before any libraries that might depend on it are imported.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be set before pytorch, so at least before lightning / DDP imported

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha!

@mst-rajatmishra
Copy link
Author

if there will be anything else let me know i will try!👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants