
doc: update env.py api documentation #15

Merged 10 commits on Jun 16, 2021.
README.md: 2 changes (1 addition, 1 deletion)

````diff
@@ -18,6 +18,6 @@ pip install bagua
 ## Build API documentation locally

 ```
-pip -r doc-requirements.txt
+pip install -r doc-requirements.txt
 make html
 ```
````
bagua/torch_api/env.py: 37 changes (24 additions, 13 deletions)

```diff
@@ -3,62 +3,73 @@

 def get_world_size():
     """
-    Returns the number of processes in the current process group
+    Get the number of processes in the current process group.

     Returns:
-        The world size of the process group
-
+        The world size of the process group.
     """
     return int(os.environ.get("WORLD_SIZE", 1))


 def get_rank():
     """
-    Returns the rank of current process group
+    Get the rank of current process group.

     Rank is a unique identifier assigned to each process within a distributed
     process group. They are always consecutive integers ranging from 0 to
     ``world_size``.

     Returns:
-        The rank of the process group
-
+        The rank of the process group.
     """
     return int(os.environ.get("RANK", 0))


 def get_local_rank():
     """
-    Returns the rank of current node
+    Get the rank of current node.

-    Rank is a unique identifier assigned to each process within a node.
+    Local rank is a unique identifier assigned to each process within a node.
     They are always consecutive integers ranging from 0 to ``local_size``.

     Returns:
-        The local rank of the node
-
+        The local rank of the node.
     """
     return int(os.environ.get("LOCAL_RANK", 0))


 def get_local_size():
     """
-    Returns the number of processes in the node
+    Get the number of processes in the node.

     Returns:
-        The local size of the node
-
+        The local size of the node.
     """
     return int(os.environ.get("LOCAL_SIZE", 1))
```
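These getters simply parse environment variables set by the process launcher, falling back to single-process defaults when unset. A quick illustrative sketch of how world size, rank, and local rank relate (the values and the node-index computation are my own illustration, not part of the diff):

```python
import os

# Simulate the environment a distributed launcher would set up
# (illustrative values: 2 nodes, 4 processes per node, this is process 5).
os.environ["WORLD_SIZE"] = "8"
os.environ["RANK"] = "5"
os.environ["LOCAL_RANK"] = "1"
os.environ["LOCAL_SIZE"] = "4"

# Same parsing as the getters above, including their defaults.
world_size = int(os.environ.get("WORLD_SIZE", 1))
rank = int(os.environ.get("RANK", 0))
local_rank = int(os.environ.get("LOCAL_RANK", 0))
local_size = int(os.environ.get("LOCAL_SIZE", 1))

# A process's node can be derived from its global rank.
node_index = rank // local_size
print(world_size, rank, local_rank, node_index)  # 8 5 1 1
```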


```diff
 def get_autotune_server_addr():
+    """
+    Get autotune server addr.
+
+    Returns:
+        The ip address of autotune server.
+    """
     return os.environ.get("AUTO_TUNE_SERVER_ADDR")
```

Review comments on get_autotune_server_addr:
Member: Inner interface, users DO NOT need to know.
Member Author: This function has been hidden.


```diff
 def is_report_metrics_switch_on():
+    """
+    Whether bagua report switch is on or not.
+    """
     return int(os.environ.get("BAGUA_REPORT_METRICS", 0)) == 1
```

Review comments on is_report_metrics_switch_on:
Member: Same as get_autotune_server_addr.
Member Author: This function has been hidden.


```diff
 def get_autotune_level():
+    """
+    Get the autotune level.
+
+    Returns:
+        The autotune level.
+    """
     return int(os.environ.get("BAGUA_AUTOTUNE", 0))
```

Review comments on get_autotune_level:
Contributor: Need to explain, or give a ref link to the autotune API doc to let readers know what this is about.
Member: Same as above. I am going to introduce the use of autotune in the tutorial.
Member Author: This function has been hidden.
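The hidden helpers above all follow one pattern: an integer-valued flag read from the environment with a default. A small sketch of that pattern (the `env_int` helper name is my own, not part of bagua):

```python
import os

def env_int(name: str, default: int = 0) -> int:
    # Integer-valued environment flag with a fallback default,
    # mirroring get_autotune_level and is_report_metrics_switch_on.
    return int(os.environ.get(name, default))

os.environ["BAGUA_AUTOTUNE"] = "1"
os.environ.pop("BAGUA_REPORT_METRICS", None)  # ensure it is unset

autotune_level = env_int("BAGUA_AUTOTUNE")        # 1
report_on = env_int("BAGUA_REPORT_METRICS") == 1  # False: unset, default 0
print(autotune_level, report_on)
```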
conf.py: 9 changes (9 additions, 0 deletions)

```diff
@@ -117,11 +117,20 @@
     "bagua.torch_api.contrib.data.load_balancing_data_loader.LoadBalancingDistributedSampler.shuffle_chunks",
     "bagua.torch_api.contrib.data.load_balancing_data_loader.LoadBalancingDistributedBatchSampler.generate_batches",
 ]
+_ignore_functions = [
+    "bagua.torch_api.env.get_autotune_server_addr",
+    "bagua.torch_api.env.is_report_metrics_switch_on",
+    "bagua.torch_api.env.get_autotune_level",
+]


 def skip_methods(app, what, name, obj, skip, options):
     if what == "method" and name in _ignore_methods:
         skip = True
         return skip
+    if what == "function" and name in _ignore_functions:
+        skip = True
+        return skip
     return skip
```
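The new `_ignore_functions` list takes effect through Sphinx's `autodoc-skip-member` event, to which `skip_methods` must be connected in conf.py's `setup()`. That registration is outside this hunk, so the following is a sketch of the standard Sphinx pattern, with a minimal standalone check of the handler's logic:

```python
# Standard Sphinx pattern (assumed, not shown in this diff): conf.py's
# setup() connects skip_methods to the autodoc-skip-member event.
#
# def setup(app):
#     app.connect("autodoc-skip-member", skip_methods)

_ignore_functions = [
    "bagua.torch_api.env.get_autotune_server_addr",
    "bagua.torch_api.env.is_report_metrics_switch_on",
    "bagua.torch_api.env.get_autotune_level",
]

def skip_methods(app, what, name, obj, skip, options):
    # Returning True tells autodoc to omit the member from the built docs.
    if what == "function" and name in _ignore_functions:
        return True
    return skip

hidden = skip_methods(None, "function",
                      "bagua.torch_api.env.get_autotune_level",
                      None, False, None)
shown = skip_methods(None, "function", "bagua.torch_api.env.get_rank",
                     None, False, None)
print(hidden, shown)  # True False
```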

