Global AI Workshops
Lab 3 - Train your model

Last updated 3 years ago

Download the training script

Start with creating a folder for your training scripts.

# Create a directory
mkdir train
cd train

# Download the training script
wget https://raw.githubusercontent.com/GlobalAICommunity/back-together-2021/main/workshop-assets/amls/train.py

# Go back to the project directory
cd .. 
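Before moving on, you can confirm the download worked with a small file check. This helper is not part of the original lab: check_training_script is a made-up name, and the check itself is just a plain POSIX file test.

```shell
# Hypothetical sanity check (not in the original lab): verify a file exists
check_training_script() {
  if [ -f "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1 - rerun the wget step"
  fi
}
```

Usage: check_training_script train/train.py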

This training script is a slightly modified version of the Transfer Learning for Computer Vision Tutorial on the PyTorch website.

Create a training job

Now that we have a training script we need to configure how the training job is going to run in the cloud.

We start with creating an empty yaml file.

code job.yml

In this file we are going to configure how to execute and run our training file. Copy and paste the content below in the job.yml.

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
experiment_name: SimpsonsClassification
code:
  local_path: ./train
command: python train.py --data {inputs.training_data} --num-epochs 8 --model-name SimpsonsClassification
environment: azureml:AzureML-pytorch-1.7-ubuntu18.04-py37-cuda11-gpu:3
compute:
  target: azureml:gpu-cluster
inputs:
  training_data:
    mode: mount
    data: azureml:LegoSimpsons:1

code local_path

This is the folder that contains train.py and any other files needed for your job to run successfully. Everything in this folder is copied over to the experiment artifacts.

command

This is the command that is executed when the job starts. Here it runs train.py, passing the mounted training data ({inputs.training_data}), the number of epochs (8), and the model name (SimpsonsClassification) as arguments.

Now we can create the job with the command below. The job takes around 5-10 minutes to complete.

az ml job create --file job.yml --query name -o tsv

The "--query name -o tsv" part of the command prints the name of the run in the console. Copy this name and use it in place of <run_name> in the commands below.

While the job is running, you can stream the live output of the job using the command below.

az ml job stream -n <run_name>

In the current version of the SDK the command above does not work.

If you just want to see the status of the job use the command below.

az ml job show -n <run_name> --query status -o tsv
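The two steps above can be combined into a small helper that submits the job and polls its status until it finishes. This is a sketch, not part of the original lab: submit_and_wait is a made-up function name, it only uses the az commands shown in this lab, and it assumes an authenticated Azure CLI session with the ml extension installed.

```shell
# Sketch: submit the job from a YAML file and poll until it finishes.
# Assumes you are logged in with `az login` and the ml extension is installed.
submit_and_wait() {
  # Capture the run name printed by --query name -o tsv
  run_name=$(az ml job create --file "$1" --query name -o tsv)
  echo "Submitted run: $run_name"
  status=$(az ml job show -n "$run_name" --query status -o tsv)
  # Poll every 30 seconds while the job is still in flight
  while [ "$status" = "Queued" ] || [ "$status" = "Starting" ] || [ "$status" = "Running" ]; do
    sleep 30
    status=$(az ml job show -n "$run_name" --query status -o tsv)
  done
  echo "Final status: $status"
}
```

Usage: submit_and_wait job.yml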

The final step in the training script registers two models: a PyTorch model and the same model converted to ONNX. The models are named SimpsonsClassification-onnx and SimpsonsClassification-pytorch. You can list them with the command below.

az ml model list -o table
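If the table gets long, you can narrow it down with a JMESPath --query expression, a standard feature of every az CLI command. The filter below is illustrative, not from the original lab; it is wrapped in a made-up helper function name.

```shell
# Sketch: list only the models registered by this lab.
# The --query expression is a standard az CLI JMESPath filter applied to
# the same `az ml model list` command used above.
list_simpsons_models() {
  az ml model list \
    --query "[?starts_with(name, 'SimpsonsClassification')].{name:name, version:version}" \
    -o table
}
```

Usage: list_simpsons_models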

Checklist

You now have two versioned models in your Azure Machine Learning workspace that can classify Simpsons images.
