Add multi-gpu github action tests using horovod #1009

Merged 9 commits into NVIDIA-Merlin:main on Mar 8, 2023

@edknv (Contributor) commented Mar 1, 2023

Goals ⚽

This PR adds a GitHub Actions workflow for multi-GPU tests.

Implementation Details 🚧

Testing Details 🔍

GitHub Actions workflow: horovod (2GPU) / gpu-ci

github-actions bot commented Mar 1, 2023

Documentation preview

https://nvidia-merlin.github.io/models/review/pr-1009

@edknv self-assigned this on Mar 1, 2023
@edknv added the ci label on Mar 1, 2023
@edknv added this to the Merlin 23.03 milestone on Mar 1, 2023
@edknv changed the title from "[WIP] Add multi-gpu github action tests using horovod" to "Add multi-gpu github action tests using horovod" on Mar 1, 2023
@edknv requested a review from jperez999 on March 1, 2023 07:22
@edknv marked this pull request as ready for review on March 1, 2023 07:22
@rnyak (Contributor) commented Mar 1, 2023

@jperez999 fyi. do you mind reviewing? thanks.

branch=main
if [[ $ref_type == "tag"* ]]
then
    git -c protocol.version=2 fetch --no-tags --prune --progress --no-recurse-submodules --depth=1 origin +refs/heads/release*:refs/remotes/origin/release*
Contributor

Just curious, asking for my own edification: what's different in protocol.version=2?

Member

This was copied from other places that use this pattern, and it is now available in the branch-name action here. The protocol.version=2 setting was there to match what GitHub's actions/checkout does in its fetch. It may no longer be strictly necessary, since v2 is the default protocol in recent versions of Git.

Contributor

It looks like v2 enables more commands to be sent over a single connection, among other things: https://git-scm.com/docs/protocol-v2
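
For anyone who wants to check this locally, here is a minimal sketch of how the `-c protocol.version=2` override behaves. It only inspects configuration and does not contact a remote. The variable names `configured` and `effective` are illustrative and not part of the workflow in this PR.

```shell
#!/usr/bin/env bash
# Sketch: `git -c key=value` applies a setting to a single command only.
# Assumption: git is installed; no repository or network access is needed.
set -euo pipefail

# Value from the user's config files, if any. An empty result means the
# key is unset and Git falls back to its built-in default protocol.
configured="$(git config --get protocol.version || true)"
echo "configured: ${configured:-<unset, built-in default applies>}"

# Value seen by one command invoked with the -c override, which is the
# same mechanism the fetch in the reviewed snippet relies on.
effective="$(git -c protocol.version=2 config --get protocol.version)"
echo "with -c override: ${effective}"
```

If the first line reports the key as unset, the explicit override in the fetch command only matters on Git versions whose built-in default is older than v2, which fits the comment above that it may no longer be strictly necessary.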

@karlhigley merged commit e7fe759 into NVIDIA-Merlin:main on Mar 8, 2023

5 participants