ResNet18 deployment on AWS Lambda

Below you can find an introductory tutorial describing deployment of a ResNet18 image classifier using torchlambda. This is only an example; for more sophisticated use cases (e.g. base64 encoding of images or testing your deployment locally) see the other tutorials section.

1. Create model to deploy

Below is the code (model.py) to load ResNet18 from torchvision and compile it as TorchScript:

import torch
import torchvision

# ResNet18 with randomly initialized weights
model = torchvision.models.resnet18()

# Compile the model to TorchScript and save it
torch.jit.script(model).save("model.ptc")

Invoke it from the CLI:

$ python model.py

You should get model.ptc in your current working directory.
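
To quickly verify the artifact you can load it back and run a dummy forward pass (a sanity check only, not part of the deployment):

import torch

# Load the saved TorchScript module and switch to evaluation mode
model = torch.jit.load("model.ptc")
model.eval()

# Dummy forward pass on a random 64x64 RGB image
with torch.no_grad():
    output = model(torch.rand(1, 3, 64, 64))

print(output.shape)  # torch.Size([1, 1000]), one logit per ImageNet class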

2. Create settings

torchlambda uses C++ to deploy models, hence it might be hard for end users to provide the necessary source code.

To alleviate some of those issues, easy-to-understand YAML settings can be used to define outputs and various elements of the neural network's deployment.

Please run the following:

$ torchlambda settings

This command will generate a torchlambda.yaml file with all available settings for you to modify according to your needs. You can see all of them with a short description below.

The generated YAML settings:
---
grad: False # Turn gradient on/off
validate_json: true # Validate correctness of JSON parsing
model: /opt/model.ptc # Path to model to load
input: # Define properties of input
  name: data # Name of field containing data
  validate_field: true # Whether the above field will be checked for correctness
  type: float # Type of data in this field (array assumed or base64)
  shape: [1, 3, width, height] # Input shapes (int or name of field as str)
  validate_shape: true # Whether to validate fields containing shape info
cast: float # Type to which tensor will be cast before inference (if any)
divide: 255 # Value by which it will be divided (if any)
normalize: # Whether to normalize the tensor
  means: [0.485, 0.456, 0.406] # Using those means
  stddevs: [0.229, 0.224, 0.225] # And those standard deviations
return: # Finally return something in JSON
  output: # Unmodified output from neural network
    type: double # Cast to double type (AWS SDK compatible)
    name: output # Name of the field where value(s) will be returned
    item: false # Use true to return a single value; a neural network usually returns more (an array)
  result: # Return another field result by modifying output
    operations: argmax # Apply argmax (more operations can be specified as list)
    arguments: 1 # Over dimension 1 (more arguments or none can be specified)
    type: int # Type returned will be integer
    name: result # Named result
    item: true # It will be a single item
 

Many fields already have sensible defaults (see the YAML settings file reference), hence they will be left out for now. In our case we will only define the bare minimum:

---
input:
  shape: [1, 3, width, height]
  type: byte
  cast: float
  divide: 255
normalize:
  means: [0.485, 0.456, 0.406]
  stddevs: [0.229, 0.224, 0.225]
return:
  result:
    operations: argmax
    type: int
    name: label
    item: true
  • input - tensor of shape [1, 3, width, height], where the first two dimensions are batch and channel (always static and equal to 1 and 3 respectively), while width and height are variable. The exact width and height will be passed as int fields in the JSON request. The type of the tensor is specified as byte (an image with values in the range [0, 255], cast on AWS Lambda's side to C++'s uint8_t). The created tensor will be cast to float and divided by 255 to fit in the [0, 1] range ResNet18 expects.
  • Data will be normalized per channel with ImageNet's pre-calculated means and standard deviations.
  • return - return the output of the network modified by the argmax operation, which creates result. The returned type will be int, and the JSON field name (torchlambda always returns JSONs) will be label. argmax over the tensor produces a single value (by default the operation is applied over all dimensions), hence item is set to true.

Save the above content in torchlambda.yaml.
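
For intuition, the pipeline these settings describe corresponds roughly to the Python below (a sketch for illustration only; the actual work happens in the generated C++ on AWS Lambda):

import torch

# Load the TorchScript model created in step 1
model = torch.jit.load("model.ptc").eval()

def pipeline(data: torch.Tensor, width: int, height: int) -> int:
    # input: byte tensor reshaped to [1, 3, width, height]
    tensor = data.reshape(1, 3, width, height)
    # cast to float, divide by 255
    tensor = tensor.to(torch.float32) / 255
    # normalize per channel with ImageNet means and standard deviations
    means = torch.tensor([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
    stddevs = torch.tensor([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)
    tensor = (tensor - means) / stddevs
    # run inference and apply argmax over all dimensions -> a single int label
    with torch.no_grad():
        output = model(tensor)
    return int(torch.argmax(output).item())

print(pipeline(torch.randint(0, 256, (3 * 64 * 64,), dtype=torch.uint8), 64, 64))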

3. Create deployment code

Now that we have our settings we can generate C++ code based on them. Run the following:

$ torchlambda template --yaml torchlambda.yaml

You should see a new folder called torchlambda in your current directory with a main.cpp file inside.

If you don't care about C++ you can move on to the next section. If you want to know a little more (or your deployment needs more customization), carry on reading.

If the YAML settings cannot fulfil your needs, torchlambda offers a basic C++ template you can start your deployment code from.

Run this simple command (no settings needed in this case):

$ torchlambda template --destination custom_deployment

This time you can find a new folder custom_deployment with main.cpp inside. This file is minimal, reasonable and working C++ code one should be able to follow easily. It does exactly the same thing (except dynamic shapes) as we did above via settings, but this time the file is readable (the previous main.cpp might be quite hard to grasp as it is autogenerated).

The generated code:
#include <aws/core/Aws.h>
#include <aws/core/utils/base64/Base64.h>
#include <aws/core/utils/json/JsonSerializer.h>
#include <aws/core/utils/memory/stl/AWSString.h>

#include <aws/lambda-runtime/runtime.h>

#include <torch/script.h>
#include <torch/torch.h>

/*!
 *
 *                    HANDLE REQUEST
 *
 */

static aws::lambda_runtime::invocation_response
handler(torch::jit::script::Module &module,
        const Aws::Utils::Base64::Base64 &transformer,
        const aws::lambda_runtime::invocation_request &request) {

  const Aws::String data_field{"data"};

  /*!
   *
   *              PARSE AND VALIDATE REQUEST
   *
   */

  const auto json = Aws::Utils::Json::JsonValue{request.payload};
  if (!json.WasParseSuccessful())
    return aws::lambda_runtime::invocation_response::failure(
        "Failed to parse input JSON file.", "InvalidJSON");

  const auto json_view = json.View();
  if (!json_view.KeyExists(data_field))
    return aws::lambda_runtime::invocation_response::failure(
        "Required data was not provided.", "InvalidJSON");

  /*!
   *
   *          LOAD DATA, TRANSFORM TO TENSOR, NORMALIZE
   *
   */

  const auto base64_data = json_view.GetString(data_field);
  Aws::Utils::ByteBuffer decoded = transformer.Decode(base64_data);

  torch::Tensor tensor =
      torch::from_blob(decoded.GetUnderlyingData(),
                       {
                           static_cast<long>(decoded.GetLength()),
                       },
                       torch::kUInt8)
          .reshape({1, 3, 64, 64})
          .toType(torch::kFloat32) /
      255.0;

  torch::Tensor normalized_tensor = torch::data::transforms::Normalize<>{
      {0.485, 0.456, 0.406}, {0.229, 0.224, 0.225}}(tensor);

  /*!
   *
   *                      MAKE INFERENCE
   *
   */

  auto output = module.forward({normalized_tensor}).toTensor();
  const int label = torch::argmax(output).item<int>();

  /*!
   *
   *                       RETURN JSON
   *
   */

  return aws::lambda_runtime::invocation_response::success(
      Aws::Utils::Json::JsonValue{}
          .WithInteger("label", label)
          .View()
          .WriteCompact(),
      "application/json");
}

int main() {
  /*!
   *
   *                        LOAD MODEL ON CPU
   *                    & SET IT TO EVALUATION MODE
   *
   */

  /* Turn off gradient */
  torch::NoGradGuard no_grad_guard{};
  /* No optimization during first pass as it might slow down inference by 30s */
  torch::jit::setGraphExecutorOptimize(false);

  constexpr auto model_path = "/opt/model.ptc";

  torch::jit::script::Module module = torch::jit::load(model_path, torch::kCPU);
  module.eval();

  /*!
   *
   *                        INITIALIZE AWS SDK
   *                    & REGISTER REQUEST HANDLER
   *
   */

  Aws::SDKOptions options;
  Aws::InitAPI(options);
  {
    const Aws::Utils::Base64::Base64 transformer{};
    const auto handler_fn =
        [&module,
         &transformer](const aws::lambda_runtime::invocation_request &request) {
          return handler(module, transformer, request);
        };
    aws::lambda_runtime::run_handler(handler_fn);
  }
  Aws::ShutdownAPI(options);
  return 0;
}

For more info run torchlambda template --help or check out documentation.

4. Create payload

Requests to AWS Lambda functions created by torchlambda should be in JSON format. Copy and run the code below to create a small example payload in the desired format:

import json

import numpy as np


def create_payload():
    width = 64
    height = 64
    data = (
        # high is exclusive, hence 256 to cover the full [0, 255] range
        np.random.randint(low=0, high=256, size=(1, 3, width, height))
        .flatten()
        .tolist()
    )

    payload = {"width": width, "height": height, "data": data}

    with open("payload.json", "w") as file:
        json.dump(payload, file)


if __name__ == "__main__":
    create_payload()

It should create a payload.json file with a randomly generated image in the data field, with width and height equal to 64. Keep this file around as it will be needed at the end.

Notice the small image size. If you wish to send larger images to AWS Lambda you should use base64 encoding as described in the base64 image encoding tutorial.
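
For reference, such a base64 payload could be created roughly as below (a sketch only, assuming raw image bytes go into the data field; see the linked tutorial for the exact settings and format):

import base64
import json

import numpy as np

# Random uint8 image serialized to raw bytes and base64-encoded
image = np.random.randint(low=0, high=256, size=(1, 3, 64, 64), dtype=np.uint8)
payload = {"data": base64.b64encode(image.tobytes()).decode("utf-8")}

with open("payload_base64.json", "w") as file:
    json.dump(payload, file)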

5. Package your source as .zip

Now that we have our model and source code, it's time to deploy them as an AWS Lambda-ready .zip package.

Run from command line:

$ torchlambda build ./torchlambda --compilation "-Wall -O2"

The above will create a torchlambda.zip file ready for deployment. Notice --compilation, where you can pass any C++ compilation flags (here -O2 for increased performance).

There are many more things one could set during this step; check torchlambda build --help or the documentation for a full list of available options and their descriptions.

6. Package your model as AWS Lambda Layer

Our source code is roughly 30MB in size (AWS Lambda has a 250MB limit), hence we can ship our model as an additional layer (so AWS S3 won't be involved). To create it run:

$ torchlambda layer ./model.ptc --destination "model.zip"

You will receive a model.zip layer in your current working directory (--destination is optional). See torchlambda layer --help or the documentation for more info.
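
If you want to check how close you are to the limit, you can sum the uncompressed sizes of both packages locally (the 250MB limit applies to unzipped contents; this is a quick local check, not a torchlambda feature):

import zipfile

# Sum uncompressed file sizes of the deployment package and the model layer
total = 0
for archive in ("torchlambda.zip", "model.zip"):
    with zipfile.ZipFile(archive) as zf:
        total += sum(info.file_size for info in zf.infolist())

print(f"Uncompressed deployment size: {total / 2**20:.1f} MiB (limit: 250 MB)")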

7. Deploy to AWS Lambda

From now on you can mostly follow the tutorial from AWS Lambda's C++ Runtime. It is assumed you have the AWS CLI configured; check Configuring the AWS CLI otherwise (or see the Test Lambda deployment locally tutorial).

7.1 Create trust policy JSON file

First create the following trust policy JSON file:

$ cat trust-policy.json
{
 "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["lambda.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

7.2 Create IAM role

Run from your shell:

$ aws iam create-role --role-name demo --assume-role-policy-document file://trust-policy.json

Note down the role Arn returned after running that command, as it will be needed during the next step.

7.3 Create AWS Lambda function

Create deployment function with the script below:

$ aws lambda create-function --function-name demo \
  --role <specify role arn from step 7.2 here> \
  --runtime provided --timeout 30 --memory-size 1024 \
  --handler torchlambda --zip-file fileb://torchlambda.zip

7.4 Create AWS Layer containing model

We already have our ResNet18 packed appropriately so run the following to make a layer from it:

$ aws lambda publish-layer-version --layer-name model \
  --description "Resnet18 neural network model" \
  --license-info "MIT" \
  --zip-file fileb://model.zip

Please save the LayerVersionArn just like in step 7.2 and insert it below to add this layer to the function from the previous step:

$ aws lambda update-function-configuration \
  --function-name demo \
  --layers <specify layer arn from above here>

This completes the whole deployment; our model is now ready to receive incoming requests.

8. Send payload to Lambda

In this final step you will send payload.json (created in step 4) to our AWS Lambda function and check whether you get a correct response.

Simply run from the CLI:

$ aws lambda invoke --function-name demo --payload file://payload.json output.json

You should get the following response in output.json (your label may vary as the image and the neural network weights are random):

$ cat output.json
  {"label": 40}

Congratulations, you have deployed a ResNet18 classifier using only AWS Lambda in a few simple steps!