In this tutorial, you will learn how to train your own image classification model using transfer learning.
The Azure Machine Learning Python SDK's PyTorch estimator enables you to easily submit PyTorch training jobs for both single-node and distributed runs on Azure compute. The model is trained to classify dog breeds using a pretrained ResNet18 model that has been trained on the Stanford Dogs dataset. This dataset was built using images and annotations from ImageNet for the task of fine-grained image categorization. To reduce workshop time, we will use a subset of this dataset that includes 10 dog breeds.
The workshop consists of two parts:
- Using Jupyter Notebooks to facilitate rapid prototyping and experimentation
- Operationalising your models using Git, Azure DevOps, and Azure Kubernetes Service
To complete this workshop, you will need:

- A laptop where you have admin rights to install and run local applications
- An Azure subscription where you have the Contributor role, or a pre-created resource group with the Contributor role assigned to your Azure AD user
- Access to the following Azure resources:
  - Azure Machine Learning Service
  - Azure Container Registry
  - Azure Storage Account
  - Azure Key Vault
  - Azure Application Insights
  - Azure Container Instances
  - Azure Kubernetes Service
  - Azure Virtual Machines
- (Optional) Access to GPU VM SKUs (e.g. `Standard_NC6` or `Standard_NC6s_v2`) -- otherwise substitute a CPU SKU; training may take longer.
Check that the Azure resources and VM series you will use are available in your region; otherwise choose a different region for these services.
Note: If you are using a corporate subscription you may encounter issues due to Azure Policy or other restrictions put in place by your employer.
You can view the subset of the dog breed data used here.
If you plan to use a local development environment, follow these steps to configure it correctly.
Install Miniconda for your OS (Windows/macOS/Linux): https://docs.conda.io/en/latest/miniconda.html
Follow these Python SDK steps to create an isolated Python environment with the azureml-sdk, Jupyter Notebook, and other required dependencies.
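The environment setup can be sketched as follows. The environment name `azureml-workshop` and the package list are illustrative; follow the linked SDK instructions for the authoritative package versions.

```shell
# Create and activate an isolated conda environment (name is illustrative)
conda create -n azureml-workshop python=3.7 -y
conda activate azureml-workshop

# Install the Azure ML SDK with the notebook extras, plus Jupyter
pip install "azureml-sdk[notebooks]" jupyter
```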
For the first part of the workshop, you'll use a typical Data Science workflow built around interactive Jupyter Notebooks.
Open a terminal and change to the root directory where you cloned this git repository.
Launch Jupyter Notebook:
```shell
cd <git-repo-dir>
jupyter notebook
```
From the Jupyter Notebook web interface, open the dog-breed-classifier.ipynb notebook and follow the instructions.
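The notebook's job-submission step can be sketched with the Azure ML Python SDK's PyTorch estimator mentioned earlier. This is a hedged outline, not the notebook's exact cells: the compute target name `gpu-cluster`, the `./scripts` folder, and the `train.py` script are placeholders, and running it requires a real Azure ML workspace with a `config.json` downloaded from the portal.

```python
# Sketch of submitting a training run with the Azure ML Python SDK (v1).
# All names below are placeholders; the notebook supplies the real values.
from azureml.core import Workspace, Experiment
from azureml.train.dnn import PyTorch

ws = Workspace.from_config()  # reads config.json for your workspace
experiment = Experiment(workspace=ws, name="dog-breed-classifier")

estimator = PyTorch(
    source_directory="./scripts",   # placeholder: folder with training code
    entry_script="train.py",        # placeholder: training script name
    compute_target="gpu-cluster",   # placeholder: your AML compute target
    use_gpu=True,
)

run = experiment.submit(estimator)
run.wait_for_completion(show_output=True)
```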
Now it's time to operationalise the machine learning process so that changes to the model code or training data can be automated using DevOps pipelines.
For the second part of the workshop, refer to the MLOps guide.
Do not commit notebook files in which you have filled in placeholders or executed cells so that output is shown. Before committing:
- Revert placeholders back to their original states
- Clear cell outputs: from the Jupyter Notebook interface select: Cell > All Output > Clear
- This workshop was forked from here: https://github.com/ronglums/PyCon-2019