
We followed these steps:

  1. Trained 5 TensorFlow models on a local machine using 5 different training sets.
  2. Saved them in .h5 format.
  3. Converted them into tar.gz archives (Model1.tar.gz, ..., Model5.tar.gz) and uploaded them to an S3 bucket.
  4. Successfully deployed a single model to an endpoint using the following code:
from sagemaker.tensorflow import TensorFlowModel

sagemaker_model = TensorFlowModel(model_data=tarS3Path + 'model{}.tar.gz'.format(1),
                                  role=role, framework_version='1.13',
                                  sagemaker_session=sagemaker_session)
predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')
predictor.predict(data.values[:, 0:])

The output was: {'predictions': [[153.55], [79.8196], [45.2843]]}

Now the problem is that we cannot issue 5 separate deploy statements and create 5 separate endpoints for the 5 models. To avoid this, we tried two approaches:

i) Used SageMaker's MultiDataModel

from sagemaker.multidatamodel import MultiDataModel

sagemaker_model1 = MultiDataModel(name='laneMultiModels',
                                  model_data_prefix=tarS3Path,
                                  model=sagemaker_model,  # the same sagemaker_model trained above
                                  # role=role, framework_version='1.13',
                                  sagemaker_session=sagemaker_session)
predictor = sagemaker_model1.deploy(initial_instance_count=1,
                                    instance_type='ml.m4.xlarge')
predictor.predict(data.values[:, 0:], target_model='model{}.tar.gz'.format(1))

At the deploy stage we got the following error: An error occurred (ValidationException) when calling the CreateModel operation: Your Ecr Image 763104351884.dkr.ecr.us-east-2.amazonaws.com/tensorflow-inference:1.13-cpu does not contain required com.amazonaws.sagemaker.capabilities.multi-models=true Docker label(s).
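
The failing label check suggests the TF 1.13 serving image simply predates multi-model support; newer TensorFlow inference images (for example the 2.2.0 image used in the answer below) do carry the label. As a sketch (the version string and instance type here are assumptions to adapt), the SDK can resolve a newer inference image URI instead of hard-coding one:

import sagemaker

# Resolve a TensorFlow Serving inference image URI for us-east-2. TF 2.2 is
# the version used in the answer below; 1.13 images lack the multi-model label.
image = sagemaker.image_uris.retrieve(framework='tensorflow',
                                      region='us-east-2',
                                      version='2.2.0',
                                      image_scope='inference',
                                      instance_type='ml.m4.xlarge')
print(image)  # e.g. ...dkr.ecr.us-east-2.amazonaws.com/tensorflow-inference:2.2.0-cpu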

ii) Created endpoint manually

import boto3
import botocore
import sagemaker
sm_client = boto3.client('sagemaker')
image = sagemaker.image_uris.retrieve('knn','us-east-2')
container = {
    "Image": image,
    "ModelDataUrl": tarS3Path,
    "Mode": "MultiModel"
}
# Note: replacing 'knn' with 'tensorflow' here raises an error at this very step
response = sm_client.create_model(
              ModelName        = 'multiple-tar-models',
              ExecutionRoleArn = role,
              Containers       = [container])
response = sm_client.create_endpoint_config(
    EndpointConfigName = 'multiple-tar-models-endpointconfig',
    ProductionVariants=[{
        'InstanceType':        'ml.t2.medium',
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName':            'multiple-tar-models',
        'VariantName':          'AllTraffic'}])
response = sm_client.create_endpoint(
              EndpointName       = 'tarmodels-endpoint',
              EndpointConfigName = 'multiple-tar-models-endpointconfig')

The endpoint could not be created with this approach either.
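
When the create_* calls are accepted but the endpoint then fails to come up, the reason is visible via describe_endpoint. A minimal sketch, using the endpoint name from the code above:

import boto3

sm_client = boto3.client('sagemaker')

# Inspect the endpoint's state; FailureReason is populated when it is 'Failed'.
desc = sm_client.describe_endpoint(EndpointName='tarmodels-endpoint')
print(desc['EndpointStatus'])        # 'Creating', 'InService', 'Failed', ...
print(desc.get('FailureReason'))

# Or block until the endpoint is in service (raises if creation fails):
sm_client.get_waiter('endpoint_in_service').wait(EndpointName='tarmodels-endpoint')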


2 Answers


I had also been looking for an answer to this, and after several days of trying with my friend, we managed to get it working. Here are the code snippets we used; you may need to modify them for your use case.

import json
import boto3

image = '763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.2.0-cpu'
container = {
    'Image': image,
    'ModelDataUrl': model_data_location,
    'Mode': 'MultiModel'
}

sagemaker_client = boto3.client('sagemaker')

Create Model

response = sagemaker_client.create_model(
    ModelName=model_name,
    ExecutionRoleArn=role,
    Containers=[container])

Create Endpoint Configuration

response = sagemaker_client.create_endpoint_config(
    EndpointConfigName=endpoint_configuration_name,
    ProductionVariants=[{
        'InstanceType': 'ml.t2.medium',
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName': model_name,
        'VariantName': 'AllTraffic'}])

Create Endpoint

response = sagemaker_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_configuration_name)

Invoke Endpoint

sagemaker_runtime_client = boto3.client('sagemaker-runtime')

content_type = "application/json"  # The MIME type of the input data in the request body.
accept = "application/json"        # The desired MIME type of the inference in the response.
payload = json.dumps({"instances": [1.0, 2.0, 5.0]})  # Payload for inference.
target_model = 'model1.tar.gz'

response = sagemaker_runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Accept=accept,
    Body=payload,
    TargetModel=target_model,
)

response
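
The Body field in that response is a streaming object, so read and decode it to see the actual predictions (standard boto3 pattern):

# Read the streaming body once and decode the JSON prediction payload.
result = json.loads(response['Body'].read().decode('utf-8'))
print(result)  # e.g. {'predictions': [...]}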

Also, make sure your model tar.gz files have this structure:

└── model1.tar.gz
    └── <version number>
        ├── saved_model.pb
        └── variables
            └── ...
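
If your artifacts are the original .h5 files from the question, here is a minimal sketch of producing that layout, assuming TF 2.x (the 'export/1' path and file names are illustrative):

import tarfile
import tensorflow as tf

# Re-export the Keras .h5 model in SavedModel format under a numeric version
# directory, which is the layout TensorFlow Serving expects.
model = tf.keras.models.load_model('model1.h5')
tf.saved_model.save(model, 'export/1')

# Package the archive so the version directory sits at its top level.
with tarfile.open('model1.tar.gz', 'w:gz') as tar:
    tar.add('export/1', arcname='1')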

More info regarding this is available in the AWS documentation on multi-model endpoints.


Simply deploy a multi-model endpoint.

https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html
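
For reference, a minimal sketch of what that can look like with the SageMaker Python SDK, assuming a TF 2.x framework_version (whose inference images carry the multi-model label); tarS3Path, role, sagemaker_session, and data are the variables from the question:

from sagemaker.multidatamodel import MultiDataModel
from sagemaker.tensorflow import TensorFlowModel

# The base model only supplies the container settings; the per-request
# artifacts are resolved from model_data_prefix via target_model.
base_model = TensorFlowModel(model_data=tarS3Path + 'model1.tar.gz',
                             role=role, framework_version='2.2',
                             sagemaker_session=sagemaker_session)
mdm = MultiDataModel(name='laneMultiModels',
                     model_data_prefix=tarS3Path,
                     model=base_model,
                     sagemaker_session=sagemaker_session)
predictor = mdm.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

# Each request names which archive under the prefix should serve it.
predictor.predict(data.values[:, 0:], target_model='model1.tar.gz')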