Merge pull request #2137 from recommenders-team/staging
Staging to main: AzureML migration to v2
miguelgfierro committed Jul 31, 2024
2 parents d333a0d + f6d3e6b commit 4f86e47
Showing 30 changed files with 547 additions and 623 deletions.
117 changes: 47 additions & 70 deletions .github/actions/azureml-test/action.yml
@@ -6,108 +6,85 @@
name: azureml-tests
description: "Submit experiment to AzureML cluster"
inputs:
# azureml experiment name
EXP_NAME:
required: true
type: string
# type of test - unit or nightly
description: AzureML experiment Name
ENV_NAME:
required: true
description: AzureML environment Name
TEST_KIND:
required: true
type: string
# test environment - cpu, gpu or spark
TEST_ENV:
required: false
type: string
# azureml compute credentials
description: Type of test - unit or nightly
AZUREML_TEST_CREDENTIALS:
required: true
type: string
# azureml compute subid
description: Credentials for AzureML login
AZUREML_TEST_SUBID:
required: true
type: string
# python version
description: AzureML subscription ID
PYTHON_VERSION:
required: true
type: string
# test group name
description: Python version used for the tests
TEST_GROUP:
required: true
type: string
# cpu cluster name
CPU_CLUSTER_NAME:
required: false
type: string
default: "cpu-cluster"
# gpu cluster name
GPU_CLUSTER_NAME:
required: false
type: string
default: "gpu-cluster"
# AzureML resource group name
description: Test group defined in test_group.py
RG:
required: false
type: string
description: AzureML resource group name
default: "recommenders_project_resources"
# AzureML workspace name
WS:
required: false
type: string
description: AzureML workspace name
default: "azureml-test-workspace"
# test logs path
TEST_LOGS_PATH:
required: false
type: string
default: '"test_logs.log"'
# pytest exit code
PYTEST_EXIT_CODE:
LOG_DIR:
required: false
type: string
default: "pytest_exit_code.log"
description: Directory storing the test logs
default: "test_logs"

runs:
using: "composite"
steps:
- name: Setup python
uses: actions/setup-python@v5
with:
python-version: "3.8"
- name: Install azureml-core and azure-cli on a GitHub hosted server
python-version: "3.10"
- name: Install AzureML Python SDK
shell: bash
run: pip install --quiet "azureml-core>1,<2" "azure-cli>2,<3"
run: pip install --quiet "azure-ai-ml>1,<2" mlflow "azureml-mlflow>1,<2"
- name: Log in to Azure
uses: azure/login@v2
with:
creds: ${{inputs.AZUREML_TEST_CREDENTIALS}}
- name: Install wheel package
shell: bash
run: pip install --quiet wheel
creds: ${{ inputs.AZUREML_TEST_CREDENTIALS }}
- name: Submit tests to AzureML
shell: bash
run: >-
run: |
echo "::group::Running tests ..."
python tests/ci/azureml_tests/submit_groupwise_azureml_pytest.py \
--subid ${{inputs.AZUREML_TEST_SUBID}} \
--reponame "recommenders" \
--branch ${{ github.ref }} \
--rg ${{inputs.RG}} \
--wsname ${{inputs.WS}} \
--expname ${{inputs.EXP_NAME}}_${{inputs.TEST_GROUP}} \
--testlogs ${{inputs.TEST_LOGS_PATH}} \
--testkind ${{inputs.TEST_KIND}} \
--conda_pkg_python ${{inputs.PYTHON_VERSION}} \
--testgroup ${{inputs.TEST_GROUP}} \
--disable-warnings \
--sha "${GITHUB_SHA}" \
--clustername $(if [[ ${{inputs.TEST_GROUP}} =~ "gpu" ]]; then echo "${{inputs.GPU_CLUSTER_NAME}}"; else echo "${{inputs.CPU_CLUSTER_NAME}}"; fi) \
$(if [[ ${{inputs.TEST_GROUP}} =~ "gpu" ]]; then echo "--add_gpu_dependencies"; fi) \
$(if [[ ${{inputs.TEST_GROUP}} =~ "spark" ]]; then echo "--add_spark_dependencies"; fi)
- name: Get exit status
--subid ${{ inputs.AZUREML_TEST_SUBID }} \
--rg ${{ inputs.RG }} \
--ws ${{ inputs.WS }} \
--cluster ${{ contains(inputs.TEST_GROUP, 'gpu') && 'gpu-cluster' || 'cpu-cluster' }} \
--expname ${{ inputs.EXP_NAME }} \
--envname ${{ inputs.ENV_NAME }} \
--testkind ${{ inputs.TEST_KIND}} \
--python-version ${{ inputs.PYTHON_VERSION }} \
--testgroup ${{ inputs.TEST_GROUP }} \
--sha ${GITHUB_SHA}
echo "::endgroup::"
- name: Post tests
if: ${{ ! cancelled() }}
shell: bash
id: exit_status
run: echo "code=$(cat ${{inputs.PYTEST_EXIT_CODE}})" >> $GITHUB_OUTPUT
- name: Check Success/Failure
if: ${{ steps.exit_status.outputs.code != 0 }}
uses: actions/github-script@v7
run: |
echo "::group::Pytest logs"
python tests/ci/azureml_tests/post_pytest.py \
--subid ${{ inputs.AZUREML_TEST_SUBID }} \
--rg ${{ inputs.RG }} \
--ws ${{ inputs.WS }} \
--expname ${{ inputs.EXP_NAME }} \
--log-dir ${{ inputs.LOG_DIR }}
echo "::endgroup::"
- name: Save logs
if: ${{ ! cancelled() }}
uses: actions/upload-artifact@v4
with:
script: |
core.setFailed('All tests did not pass!')
name: logs-${{ inputs.TEST_GROUP }}-python${{ inputs.PYTHON_VERSION }}
path: ${{ inputs.LOG_DIR }}
9 changes: 4 additions & 5 deletions .github/actions/get-test-groups/action.yml
@@ -6,18 +6,17 @@
name: get-test-groups
description: "Get test group names from tests_groups.py"
inputs:
# type of test - unit or nightly
TEST_KIND:
required: true
type: string
# test environment - cpu, gpu or spark
description: Type of test - unit or nightly
TEST_ENV:
required: false
type: string
description: Test environment - cpu, gpu or spark
default: 'cpu'
outputs:
test_groups:
value: ${{steps.get_test_groups.outputs.test_groups}}
description: A list of test groups
value: ${{ steps.get_test_groups.outputs.test_groups }}

runs:
using: "composite"
8 changes: 4 additions & 4 deletions .github/workflows/azureml-cpu-nightly.yml
@@ -34,7 +34,7 @@ on:

# Enable manual trigger
workflow_dispatch:
input:
inputs:
tags:
description: 'Tags to label this manual run (optional)'
default: 'Manual trigger'
@@ -67,7 +67,7 @@ jobs:
strategy:
max-parallel: 50 # Usage limits: https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
matrix:
python-version: ['"python=3.8"', '"python=3.9"', '"python=3.10"', '"python=3.11"']
python-version: ["3.8", "3.9", "3.10", "3.11"]
test-group: ${{ fromJSON(needs.get-test-groups.outputs.test_groups) }}
steps:
- name: Check out repository code
@@ -76,9 +76,9 @@ jobs:
uses: ./.github/actions/azureml-test
id: execute_tests
with:
EXP_NAME: 'nightly_tests'
EXP_NAME: recommenders-nightly-${{ matrix.test-group }}-python${{ matrix.python-version }}-${{ github.ref }}
ENV_NAME: recommenders-${{ github.sha }}-python${{ matrix.python-version }}${{ contains(matrix.test-group, 'gpu') && '-gpu' || '' }}${{ contains(matrix.test-group, 'spark') && '-spark' || '' }}
TEST_KIND: 'nightly'
TEST_ENV: 'cpu'
AZUREML_TEST_CREDENTIALS: ${{ secrets.AZUREML_TEST_CREDENTIALS }}
AZUREML_TEST_SUBID: ${{ secrets.AZUREML_TEST_SUBID }}
PYTHON_VERSION: ${{ matrix.python-version }}
8 changes: 4 additions & 4 deletions .github/workflows/azureml-gpu-nightly.yml
@@ -34,7 +34,7 @@ on:

# Enable manual trigger
workflow_dispatch:
input:
inputs:
tags:
description: 'Tags to label this manual run (optional)'
default: 'Manual trigger'
@@ -67,7 +67,7 @@ jobs:
strategy:
max-parallel: 50 # Usage limits: https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
matrix:
python-version: ['"python=3.8"', '"python=3.9"', '"python=3.10"', '"python=3.11"']
python-version: ["3.8", "3.9", "3.10", "3.11"]
test-group: ${{ fromJSON(needs.get-test-groups.outputs.test_groups) }}
steps:
- name: Check out repository code
@@ -76,9 +76,9 @@ jobs:
uses: ./.github/actions/azureml-test
id: execute_tests
with:
EXP_NAME: 'nightly_tests'
EXP_NAME: recommenders-nightly-${{ matrix.test-group }}-python${{ matrix.python-version }}-${{ github.ref }}
ENV_NAME: recommenders-${{ github.sha }}-python${{ matrix.python-version }}${{ contains(matrix.test-group, 'gpu') && '-gpu' || '' }}${{ contains(matrix.test-group, 'spark') && '-spark' || '' }}
TEST_KIND: 'nightly'
TEST_ENV: 'gpu'
AZUREML_TEST_CREDENTIALS: ${{ secrets.AZUREML_TEST_CREDENTIALS }}
AZUREML_TEST_SUBID: ${{ secrets.AZUREML_TEST_SUBID }}
PYTHON_VERSION: ${{ matrix.python-version }}
8 changes: 4 additions & 4 deletions .github/workflows/azureml-spark-nightly.yml
@@ -33,7 +33,7 @@ on:

# Enable manual trigger
workflow_dispatch:
input:
inputs:
tags:
description: 'Tags to label this manual run (optional)'
default: 'Manual trigger'
@@ -66,7 +66,7 @@ jobs:
strategy:
max-parallel: 50 # Usage limits: https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
matrix:
python-version: ['"python=3.8"', '"python=3.9"', '"python=3.10"', '"python=3.11"']
python-version: ["3.8", "3.9", "3.10", "3.11"]
test-group: ${{ fromJSON(needs.get-test-groups.outputs.test_groups) }}
steps:
- name: Check out repository code
@@ -75,9 +75,9 @@ jobs:
uses: ./.github/actions/azureml-test
id: execute_tests
with:
EXP_NAME: 'nightly_tests'
EXP_NAME: recommenders-nightly-${{ matrix.test-group }}-python${{ matrix.python-version }}-${{ github.ref }}
ENV_NAME: recommenders-${{ github.sha }}-python${{ matrix.python-version }}${{ contains(matrix.test-group, 'gpu') && '-gpu' || '' }}${{ contains(matrix.test-group, 'spark') && '-spark' || '' }}
TEST_KIND: 'nightly'
TEST_ENV: 'spark'
AZUREML_TEST_CREDENTIALS: ${{ secrets.AZUREML_TEST_CREDENTIALS }}
AZUREML_TEST_SUBID: ${{ secrets.AZUREML_TEST_SUBID }}
PYTHON_VERSION: ${{ matrix.python-version }}
7 changes: 4 additions & 3 deletions .github/workflows/azureml-unit-tests.yml
@@ -23,7 +23,7 @@ on:

# Enable manual trigger
workflow_dispatch:
input:
inputs:
tags:
description: 'Tags to label this manual run (optional)'
default: 'Manual trigger'
@@ -56,7 +56,7 @@ jobs:
strategy:
max-parallel: 50 # Usage limits: https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
matrix:
python-version: ['"python=3.8"', '"python=3.9"', '"python=3.10"', '"python=3.11"']
python-version: ["3.8", "3.9", "3.10", "3.11"]
test-group: ${{ fromJSON(needs.get-test-groups.outputs.test_groups) }}
steps:
- name: Check out repository code
@@ -65,7 +65,8 @@ jobs:
uses: ./.github/actions/azureml-test
id: execute_tests
with:
EXP_NAME: 'unit_tests'
EXP_NAME: recommenders-unit-${{ matrix.test-group }}-python${{ matrix.python-version }}-${{ github.sha }}
ENV_NAME: recommenders-${{ github.sha }}-python${{ matrix.python-version }}${{ contains(matrix.test-group, 'gpu') && '-gpu' || '' }}${{ contains(matrix.test-group, 'spark') && '-spark' || '' }}
TEST_KIND: 'unit'
AZUREML_TEST_CREDENTIALS: ${{ secrets.AZUREML_TEST_CREDENTIALS }}
AZUREML_TEST_SUBID: ${{ secrets.AZUREML_TEST_SUBID }}
1 change: 1 addition & 0 deletions README.md
@@ -164,6 +164,7 @@ The nightly build tests are run daily on AzureML.

## References

- **FREE COURSE**: M. González-Fierro, "Recommendation Systems: A Practical Introduction", LinkedIn Learning, 2024. [Available on this link](https://www.linkedin.com/learning/recommendation-systems-a-practical-introduction).
- D. Li, J. Lian, L. Zhang, K. Ren, D. Lu, T. Wu, X. Xie, "Recommender Systems: Frontiers and Practices", Springer, Beijing, 2024. [Available on this link](https://www.amazon.com/Recommender-Systems-Frontiers-Practices-Dongsheng/dp/9819989639/).
- A. Argyriou, M. González-Fierro, and L. Zhang, "Microsoft Recommenders: Best Practices for Production-Ready Recommendation Systems", *WWW 2020: International World Wide Web Conference Taipei*, 2020. Available online: https://dl.acm.org/doi/abs/10.1145/3366424.3382692
- S. Graham, J.K. Min, T. Wu, "Microsoft recommenders: tools to accelerate developing recommender systems", *RecSys '19: Proceedings of the 13th ACM Conference on Recommender Systems*, 2019. Available online: https://dl.acm.org/doi/10.1145/3298689.3346967
1 change: 1 addition & 0 deletions setup.py
@@ -36,6 +36,7 @@
"nltk>=3.8.1,<4", # requires tqdm
"notebook>=6.5.5,<8", # requires ipykernel, jinja2, jupyter, nbconvert, nbformat, packaging, requests
"numba>=0.57.0,<1",
"numpy<2.0.0", # FIXME: Remove numpy<2.0.0 once cornac release a version newer than 2.2.1 that resolve ImportError: numpy.core.multiarray failed to import.
"pandas>2.0.0,<3.0.0", # requires numpy
"pandera[strategies]>=0.6.5,<0.18;python_version<='3.8'", # For generating fake datasets
"pandera[strategies]>=0.15.0;python_version>='3.9'",
18 changes: 12 additions & 6 deletions tests/README.md
@@ -5,6 +5,12 @@ Licensed under the MIT License.

# Tests

The Recommenders test pipeline is one of the most sophisticated MLOps pipelines in the open-source community. We execute tests in the three environments we support (CPU, GPU, and Spark) and repeat them for every Python version we support. We test not only the library but also the Jupyter notebooks in the examples folder.

The reason for having this extensive test infrastructure is to ensure that the code is reproducible by the community and that the project can be maintained by a small number of core contributors.

We currently execute over a thousand tests in the project, and we are always looking for ways to improve the test coverage. To get the exact number of tests, run `pytest tests --collect-only` and multiply the number of collected tests by the number of Python versions we support.
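As a rough illustration of how that count can be obtained programmatically, the sketch below (not part of the repository) counts the tests pytest collects in the current environment.

```python
# Count how many tests pytest collects in the current environment; the total
# across the pipeline is roughly this number times the Python versions tested.
import pytest


class CollectionCounter:
    """Minimal pytest plugin that records the number of collected test items."""

    def __init__(self):
        self.count = 0

    def pytest_collection_modifyitems(self, items):
        self.count = len(items)


counter = CollectionCounter()
pytest.main(["tests", "--collect-only", "-q"], plugins=[counter])
print(f"Collected {counter.count} tests in this environment")
```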

In this document we show our test infrastructure and how to contribute tests to the repository.

## Table of Contents
@@ -74,15 +74,80 @@ In this section we show how to create tests and add them to the test pipeline.

### How to create tests for the Recommenders library

You want to make sure that all your code works before you submit it to the repository. Here are some guidelines for creating the unit tests:
You want to make sure that all your code works before you submit it to the repository. Here are some guidelines for creating the tests:

* It is better to create multiple small tests than one large test that checks all the code.
* Use `@pytest.fixture` to create data in your tests.
* Use the mark `@pytest.mark.gpu` if you want the test to be executed
in a GPU environment. Use `@pytest.mark.spark` if you want the test
to be executed in a Spark environment.
* Use `@pytest.mark.notebooks` if you are testing a notebook.
* Avoid using `is` in the asserts, instead use the operator `==`.
* Follow the pattern `assert computation == value`, for example:
```python
assert results["precision"] == pytest.approx(0.330753)
@@ -92,6 +93,11 @@
assert rmse(rating_true, rating_true) == 0
assert rmse(rating_true, rating_pred) == pytest.approx(7.254309)
```
* Use the operator `==` with values. Use the operator `is` only with singletons such as `None`, `True`, or `False`.
* Make explicit asserts. In other words, always assert against an expected value (`assert computation == value`) rather than just `assert computation`.
* Use the mark `@pytest.mark.gpu` if you want the test to be executed in a GPU environment. Use `@pytest.mark.spark` if you want the test to be executed in a Spark environment.
* Use `@pytest.mark.notebooks` if you are testing a notebook.
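To make these guidelines concrete, here is a small illustrative test module; the data, expected values, and function names are made up for demonstration and are not taken from the repository.

```python
# Illustrative tests following the guidelines above (hypothetical data and values).
import pytest


@pytest.fixture
def rating_data():
    """Small, deterministic dataset built inside the test module."""
    rating_true = [3.0, 4.0, 5.0]
    rating_pred = [3.1, 3.9, 5.2]
    return rating_true, rating_pred


def test_identical_ratings_have_zero_error(rating_data):
    rating_true, _ = rating_data
    errors = [t - t for t in rating_true]
    # Explicit assert against a value, using == (pytest.approx for floats).
    assert sum(errors) == pytest.approx(0.0)


@pytest.mark.gpu
def test_gpu_marked_example(rating_data):
    # Runs only in the GPU test environment because of the @pytest.mark.gpu marker.
    _, rating_pred = rating_data
    result = max(rating_pred)
    assert result == pytest.approx(5.2)
    assert result is not None  # `is` comparisons are reserved for singletons like None
```

A notebook test would be marked with `@pytest.mark.notebooks` in the same way.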


### How to create tests for the notebooks
