ENH: Add support to specify input and output data names #953

Merged: 3 commits, Aug 30, 2024

@fepegar (Contributor) commented Aug 21, 2024

When a data asset is downloaded or mounted on AML, it can be given a "name" that the submitted job can use to look up the target folder at run time. This PR adds support for specifying these names, as an alternative to the defaults (INPUT_0 etc.).
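To make the naming rule concrete, here is a minimal sketch of the fallback behaviour described above. It is not the code in health_azure: `resolve_input_names` is a hypothetical helper, and it assumes the default name is `INPUT_<index>`, with the index taken from the dataset's position in the list.

```python
from typing import List, Optional, Sequence


def resolve_input_names(data_names: Sequence[Optional[str]]) -> List[str]:
    """Return one input name per dataset.

    Hypothetical helper: a dataset with no explicit name falls back to
    INPUT_<index>, where <index> is its position (assumed convention).
    """
    return [
        name if name else f"INPUT_{index}"
        for index, name in enumerate(data_names)
    ]


# First dataset has no explicit name, second one is named by the user.
names = resolve_input_names([None, "eval_data_dir"])
print(names)
```

With these assumptions, the unnamed dataset keeps a positional default while the named one is addressed as `eval_data_dir`.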

Before:

dataset_config = DatasetConfig(
    name=path,
    version=version,
    use_mounting=mount,
)

After:

dataset_config = DatasetConfig(
    name=path,
    version=version,
    use_mounting=mount,
    data_name="eval_data_dir",
)

The command may now include an argument ${{inputs.eval_data_dir}}, which is interpolated to the local folder where the data lives during the run.
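For illustration, the snippet below simulates that interpolation locally. It is only a sketch of the effect, not how AML implements it: `interpolate_inputs`, the script name `evaluate.py`, the `--data_dir` flag, and the path `/mnt/data/eval` are all hypothetical.

```python
import re
from typing import Dict

# Matches placeholders of the form ${{inputs.<name>}}.
PLACEHOLDER = re.compile(r"\$\{\{inputs\.(\w+)\}\}")


def interpolate_inputs(command: str, input_paths: Dict[str, str]) -> str:
    """Replace each ${{inputs.<name>}} with the folder registered for <name>.

    Local simulation only; in a real job the substitution is done by AML.
    Raises KeyError if the command references an unknown input name.
    """
    def replace(match: "re.Match[str]") -> str:
        return input_paths[match.group(1)]

    return PLACEHOLDER.sub(replace, command)


# Hypothetical command and folder, matching data_name="eval_data_dir".
command = "python evaluate.py --data_dir ${{inputs.eval_data_dir}}"
resolved = interpolate_inputs(command, {"eval_data_dir": "/mnt/data/eval"})
print(resolved)
```

The point of the named input is that the command never hard-codes the folder: the same string works whether the data is mounted or downloaded.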

@fepegar changed the title from "Add support to specify input and output data names" to "ENH: Add support to specify input and output data names" on Aug 21, 2024
@fepegar requested review from ant0nsc and samb-t on August 21, 2024 12:20

@ant0nsc (Collaborator) left a comment

Overall this looks great and is a nice change in particular for SDK v2 jobs.
It would be nice if you could add documentation here https://hi-ml.readthedocs.io/en/latest/datasets.html that shows this flag in action.

hi-ml-azure/src/health_azure/datasets.py (2 resolved review threads)
@fepegar marked this pull request as draft on August 22, 2024 16:54
@fepegar marked this pull request as ready for review on August 30, 2024 12:20
@fepegar merged commit 0501bd4 into main on Aug 30, 2024
43 checks passed
@fepegar deleted the fperezgarcia/support-input-output-name branch on August 30, 2024 12:20

@fepegar (Contributor, Author) commented Aug 30, 2024

Thanks @samb-t and @ant0nsc for reviewing!
