Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_download_file_from_run silently does nothing if is_local_rank_zero() returns False #915

Open
fepegar opened this issue Dec 29, 2023 · 0 comments

Comments

@fepegar
Copy link
Contributor

fepegar commented Dec 29, 2023

This might be dangerous because the way health_azure checks whether is_local_rank_zero might be compatible with Lightning but not with other frameworks, and the files are unexpectedly missing after being "downloaded". This is especially problematic when validate_checksum is True, because one would expect that things are double-checked after downloading.

def _download_file_from_run(
run: Run, filename: str, output_file: Path, validate_checksum: bool = False
) -> Optional[Path]:
"""
Download a single file from an Azure ML Run, optionally validating the content to ensure the file is not
corrupted during download. If running inside a distributed setting, will only attempt to download the file
onto the node with local_rank==0. This prevents multiple processes on the same node from trying to download
the same file, which can lead to errors.
:param run: The AML Run to download associated file for
:param filename: The name of the file as it exists in Azure storage
:param output_file: Local path to which the file should be downloaded
:param validate_checksum: Whether to validate the content from HTTP response
:return: The path to the downloaded file if local rank is zero, else None
"""
if not is_local_rank_zero():
return None
run.download_file(filename, output_file_path=str(output_file), _validate_checksum=validate_checksum)
return output_file

A minimally-invasive change would be logging a warning before returning None. It might also be nice to propagate the paths of the downloaded files up to other functions that call this one. For example, maybe _download_files_from_run and download_files_from_run_id should return the paths of all downloaded files.

@fepegar fepegar changed the title _download_file_from_run silently does nothing if is_local_rank_zero() returns True _download_file_from_run silently does nothing if is_local_rank_zero() returns False Apr 16, 2024
@fepegar fepegar changed the title _download_file_from_run silently does nothing if is_local_rank_zero() returns False _download_file_from_run silently does nothing if is_local_rank_zero() returns False Apr 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant