ENH: Add support to specify input and output data names #953

fepegar · 2024-08-21T12:11:42Z

When a data asset is downloaded or mounted on AML, it can be given a "name" that is used to retrieve the target folder dynamically inside the submitted job. This PR adds support for specifying these names, as an alternative to the defaults (INPUT_0, etc.).

Before:

dataset_config = DatasetConfig(
    name=path,
    version=version,
    use_mounting=mount,
)

After:

dataset_config = DatasetConfig(
    name=path,
    version=version,
    use_mounting=mount,
    data_name="eval_data_dir",
)

The command can now include an argument such as ${{inputs.eval_data_dir}}, which is interpolated to the local folder where the data lives during the run.
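To make the interpolation concrete, here is a minimal, self-contained sketch of what the placeholder substitution amounts to. It is only an illustration: `interpolate_inputs`, the `evaluate.py` command, and the `/mnt/data/eval` path are hypothetical names chosen for this example, and AML performs the real substitution itself when the job starts.

```python
import re
from typing import Mapping

# Matches placeholders of the form ${{inputs.<name>}}.
PLACEHOLDER = re.compile(r"\$\{\{inputs\.(\w+)\}\}")


def interpolate_inputs(command: str, input_paths: Mapping[str, str]) -> str:
    """Replace each ${{inputs.<name>}} placeholder with its local folder.

    Hypothetical helper for illustration only; it is not part of hi-ml
    or of the Azure ML SDK.
    """

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in input_paths:
            raise KeyError(f"No input named {name!r}")
        return input_paths[name]

    return PLACEHOLDER.sub(substitute, command)


# Assumed example values: the script name and the folder are made up.
command = "python evaluate.py --data_dir ${{inputs.eval_data_dir}}"
resolved = interpolate_inputs(command, {"eval_data_dir": "/mnt/data/eval"})
print(resolved)
```

With a meaningful `data_name`, the command stays readable and does not depend on the position of the dataset in the list of inputs, which is what the default INPUT_0-style names encode.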

ant0nsc

Overall this looks great and is a nice change, in particular for SDK v2 jobs.
It would be nice if you could add documentation at https://hi-ml.readthedocs.io/en/latest/datasets.html that shows this flag in action.

hi-ml-azure/src/health_azure/datasets.py

fepegar · 2024-08-30T12:20:53Z

Thanks @samb-t and @ant0nsc for reviewing!

Add support to specify input and output data names

12b5ba9

fepegar changed the title ~~Add support to specify input and output data names~~ ENH: Add support to specify input and output data names Aug 21, 2024

fepegar requested review from ant0nsc and samb-t August 21, 2024 12:20

samb-t approved these changes Aug 21, 2024

ant0nsc approved these changes Aug 22, 2024

hi-ml-azure/src/health_azure/datasets.py (resolved)

hi-ml-azure/src/health_azure/datasets.py (resolved)

fepegar marked this pull request as draft August 22, 2024 16:54

samb-t and others added 2 commits August 28, 2024 10:32

Merge branch 'main' into fperezgarcia/support-input-output-name

484a491

Improve documentation for new feature

35e670a

fepegar marked this pull request as ready for review August 30, 2024 12:20

fepegar merged commit 0501bd4 into main Aug 30, 2024
43 checks passed

fepegar deleted the fperezgarcia/support-input-output-name branch August 30, 2024 12:20
