Deploy trained Keras or TensorFlow models using Amazon SageMaker
Amazon SageMaker makes it easier for any developer or data scientist to build, train, and deploy machine learning (ML) models. While it’s designed to alleviate the undifferentiated heavy lifting from the full life cycle of ML models, Amazon SageMaker’s capabilities can also be used independently of one another; that is, models trained in Amazon SageMaker can be optimized and deployed outside of Amazon SageMaker (or even out of the cloud on mobile or IoT devices at the edge). Conversely, Amazon SageMaker can deploy and host pre-trained models from model zoos or from other members of your team.
In this blog post, we’ll demonstrate how to deploy a trained Keras (TensorFlow or MXNet backend) or TensorFlow model using Amazon SageMaker, taking advantage of Amazon SageMaker deployment capabilities, such as selecting the type and number of instances, performing A/B testing, and Auto Scaling. Auto Scaling clusters are spread across multiple Availability Zones to deliver high performance and high availability.
Your trained model will need to be saved in either the Keras format (JSON architecture and HDF5 weights files) or the TensorFlow ProtoBuf format. If you’d like to begin from a sample notebook that supports this blog post, download it here.
For more on training the model on Amazon SageMaker and deploying it, refer to this notebook on GitHub.
Step 1. Set up
In the AWS Management Console, go to the Amazon SageMaker console. Choose Notebook Instances, and create a new notebook instance. Upload the current notebook and set the kernel to conda_tensorflow_p36.
The get_execution_role function retrieves the AWS Identity and Access Management (IAM) role you created at the time of creating your notebook instance.
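In the first notebook cell, the role and session are typically set up along these lines (a minimal sketch using the SageMaker Python SDK; it only runs inside an environment with AWS credentials, such as a notebook instance):

```python
import sagemaker
from sagemaker import get_execution_role

# Session wraps the AWS clients the SDK uses for S3 uploads and endpoint calls
sagemaker_session = sagemaker.Session()

# The IAM role attached to the notebook instance when it was created
role = get_execution_role()
```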
Step 2. Load the Keras model using the JSON and weights file
If you saved your model in the TensorFlow ProtoBuf format, skip to “Step 4. Convert the TensorFlow model to an Amazon SageMaker-readable format.”
Navigate to keras_model from the Jupyter notebook home, and upload your model.json and model-weights.h5 files (using the “Upload” menu on the Jupyter notebook home). To use a sample model for this exercise, download and unzip the files found here, then upload them to keras_model.
Step 4. Convert the TensorFlow model to an Amazon SageMaker-readable format
Move the TensorFlow exported model into a directory export/Servo. Amazon SageMaker will recognize this as a loadable TensorFlow model. Your directory and file structure should look like this:
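A typical layout, assuming a standard TensorFlow SavedModel export (the version directory name `1` is an example):

```
export
└── Servo
    └── 1
        ├── saved_model.pb
        └── variables
            ├── variables.data-00000-of-00001
            └── variables.index
```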
Tar the entire directory and upload it to Amazon S3.
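Packaging can be done with the standard library; the S3 upload is sketched in comments because it requires AWS credentials, and the bucket name is a placeholder:

```python
import os
import tarfile

def package_model(export_dir, archive="model.tar.gz"):
    """Bundle the export/ directory (containing Servo/...) into a gzipped tar
    that Amazon SageMaker can load as a TensorFlow model."""
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps the top-level "export" folder inside the archive
        tar.add(export_dir, arcname="export")
    return archive

if os.path.isdir("export"):
    package_model("export")

# Upload the archive to S3 (bucket name is a placeholder; assumes boto3 credentials):
# import boto3
# boto3.client("s3").upload_file("model.tar.gz", "my-sagemaker-bucket", "model/model.tar.gz")
```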
Step 5. Deploy the trained model
The entry_point file train.py can be an empty Python file. The requirement will be removed at a later date.
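With the SageMaker Python SDK, deployment might look like the following sketch (v1-era API; the bucket, key, and instance type are placeholders you should replace with your own):

```python
import sagemaker
from sagemaker.tensorflow.model import TensorFlowModel

role = sagemaker.get_execution_role()

# model_data points at the tarball uploaded in Step 4; entry_point may be an empty file
model = TensorFlowModel(
    model_data="s3://my-sagemaker-bucket/model/model.tar.gz",
    role=role,
    entry_point="train.py",
)

# The type and number of instances are chosen here; they can later be changed,
# A/B tested, or auto scaled without retraining the model.
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")
```

The returned predictor can then be invoked directly from the notebook, for example with predictor.predict(data).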
Note: You need to update the endpoint in the following command with the endpoint name from the output of the previous cell (INFO:sagemaker:Creating endpoint with name sagemaker-tensorflow-2019-01-29-17-36-55-987).
Step 6. Invoke the endpoint
Invoke the Amazon SageMaker endpoint from the notebook
Invoke the Amazon SageMaker endpoint using a boto3 client
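Outside the notebook, any AWS client can call the endpoint through the SageMaker runtime. A minimal sketch (the boto3 invocation is commented out because it requires AWS credentials; the endpoint name is the one printed during deployment, and the region and payload shape are placeholders):

```python
import json

def build_payload(instances):
    """Serialize input rows into the JSON body a TensorFlow Serving endpoint expects."""
    return json.dumps({"instances": instances})

# Invoke with boto3 (assumes boto3 is installed and AWS credentials are configured):
#
# import boto3
# runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")
# response = runtime.invoke_endpoint(
#     EndpointName="sagemaker-tensorflow-2019-01-29-17-36-55-987",
#     ContentType="application/json",
#     Body=build_payload([[0.1, 0.2, 0.3]]),
# )
# result = json.loads(response["Body"].read())
```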
Step 7. Clean up
To avoid incurring unnecessary charges, use the AWS Management Console to delete the resources that you created for this exercise: https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-cleanup.html
Conclusion
In this blog post, we demonstrated deploying a trained Keras or TensorFlow model at scale using Amazon SageMaker, independent of the computing resource used for model training. This gives you the flexibility to use your existing workflows for model training, while easily deploying the trained models to production with all the benefits offered by a managed platform. These benefits include the ability to select the optimal type and number of deployment instances, perform A/B testing, and auto scale. The Auto Scaling clusters of Amazon SageMaker ML instances can be spread across multiple Availability Zones to deliver both high performance and high availability.
About the author
Priya Ponnapalli is a principal data scientist at Amazon ML Solutions Lab, where she helps AWS customers across different industries accelerate their AI and cloud adoption.