feat(autotune): add support for reporting tensor completion order #146

Merged
NOBLES5E merged 48 commits into master from telemetry on Aug 3, 2021

Conversation

shjwudp
Member

@shjwudp shjwudp commented Jul 27, 2021

No description provided.

bagua/distributed/launch.py (outdated, resolved)
bagua/distributed/run.py (outdated, resolved)
bagua/service/autotune_service.py (outdated, resolved)
bagua/service/autotune_service.py (outdated, resolved)
bagua/service/autotune_service.py (resolved)
tests/service/test_autotune_service.py (outdated, resolved)
tests/service/test_autotune_service.py (outdated, resolved)
tests/service/test_autotune_service.py (outdated, resolved)
tests/service/test_autotune_service.py (outdated, resolved)
tests/service/test_autotune_service.py (outdated, resolved)
shjwudp and others added 5 commits July 27, 2021 20:28
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@shjwudp shjwudp requested a review from NOBLES5E July 28, 2021 03:19
Contributor

@NOBLES5E NOBLES5E left a comment

Please add a distributed GPU end-to-end test.

@shjwudp shjwudp requested a review from NOBLES5E July 29, 2021 09:12
Contributor

@NOBLES5E NOBLES5E left a comment

We need an end-to-end test with telemetry that registers tensors in an arbitrary order (to check that telemetry works) and asserts that autotune actually improves speed.
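
A rough sketch of how the registration-order part of such a test could be structured. It does not use the Bagua API: `report_completion_order`, `run_trial`, and the simulated timestamps are hypothetical stand-ins, assumed here only to illustrate the idea that the reported completion order should not depend on the order in which tensors were registered.

```python
import random


def report_completion_order(registered, ready_times):
    """Return tensor names ordered by when their gradient became ready.

    Hypothetical helper: `registered` is the (arbitrary) registration
    order, `ready_times` maps each name to a simulated timestamp.
    """
    # Sort by completion time only; registration order must not leak in.
    return sorted(registered, key=lambda name: ready_times[name])


def run_trial(seed):
    names = [f"layer{i}.weight" for i in range(8)]
    # Simulated backward pass: later layers finish first.
    ready_times = {name: len(names) - i for i, name in enumerate(names)}
    # Register the tensors in a shuffled, seed-dependent order.
    shuffled = names[:]
    random.Random(seed).shuffle(shuffled)
    return report_completion_order(shuffled, ready_times)


# The reported order should be identical for every registration order.
expected = [f"layer{i}.weight" for i in reversed(range(8))]
results = [run_trial(seed) for seed in range(5)]
assert all(result == expected for result in results)
```

The speed assertion is left out of the sketch: it would need real GPUs and a measured comparison of throughput with and without autotune.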

@shjwudp shjwudp requested a review from NOBLES5E August 2, 2021 13:03
}

logfile=$(mktemp /tmp/bagua_benchmark.XXXXXX.log)
python -m bagua.distributed.run \
Contributor

Should we install the master version of bagua-core first?

Contributor

@NOBLES5E NOBLES5E left a comment

See the inline comments.

@NOBLES5E NOBLES5E changed the title from "feat(autotune service): add support for reporting tensor completion order" to "feat(autotune): add support for reporting tensor completion order" Aug 3, 2021
@NOBLES5E NOBLES5E merged commit e1a407f into master Aug 3, 2021
@NOBLES5E NOBLES5E deleted the telemetry branch August 3, 2021 07:03
2 participants