[Project] Deploy trained model to AWS Lambda with Serverless framework

Hi guys,

We have continued updating our open source project for packaging and deploying ML models to production, and we have created an easy way to package an ML model as a Serverless framework project that you can easily deploy to AWS Lambda and Google Cloud Functions. We want to share it with you guys and hear your feedback.


A little background on BentoML for those who aren’t familiar with it: BentoML is a Python library for packaging and deploying machine learning models. It provides high-level APIs for defining an ML service and packaging its artifacts, source code, dependencies, and configurations into a production-system-friendly format that is ready for deployment.

Feature highlights:

  • Multiple Distribution Format – Easily package your machine learning models into the format that works best with your inference scenario:
      – Docker Image – deploy as containers running a REST API server
      – PyPI Package – integrate into your Python applications seamlessly
      – CLI tool – put your model into an Airflow DAG or CI/CD pipeline
      – Spark UDF – run batch serving on large datasets with Spark
      – Serverless Function – host your model on serverless cloud platforms

  • Multiple Framework Support – BentoML supports a wide range of ML frameworks out of the box, including TensorFlow, PyTorch, scikit-learn, and XGBoost, and can be easily extended to work with new or custom frameworks.

  • Deploy Anywhere – A BentoML-bundled ML service can be easily deployed with platforms such as Docker, Kubernetes, Serverless, Airflow and Clipper, on cloud platforms including AWS Lambda/ECS/SageMaker, Google Cloud Functions, and Azure ML.

  • Custom Runtime Backend – Easily integrate your Python preprocessing code with a high-performance deep learning model runtime backend (such as tensorflow-serving) to deploy a low-latency serving endpoint.


How to package a machine learning model as a serverless project with BentoML

It’s surprisingly easy and takes just a single CLI command. After you have finished training your model and saved it to the file system with BentoML, all you need to do is run the bentoml build-serverless-archive command with one of the listed --platform values, for example:

 $ bentoml build-serverless-archive /path_to_bentoml_archive /path_to_generated_serverless_project --platform=[aws-python, aws-python3, google-python]
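If you want to call this from a Python script (for example in a CI job), a minimal sketch could assemble the same command as an argument list. The helper name build_serverless_archive_cmd is hypothetical and not part of BentoML; it only rebuilds the command shown above and checks the platform against the three values listed there.

```python
# Platform values taken from the example command in the post.
PLATFORMS = ("aws-python", "aws-python3", "google-python")


def build_serverless_archive_cmd(archive_path, output_path, platform):
    """Return the argv list for `bentoml build-serverless-archive`.

    Hypothetical helper (not a BentoML API): it only assembles the
    command and does not run it.
    """
    if platform not in PLATFORMS:
        raise ValueError(f"platform must be one of {PLATFORMS}, got {platform!r}")
    return [
        "bentoml",
        "build-serverless-archive",
        archive_path,
        output_path,
        f"--platform={platform}",
    ]


cmd = build_serverless_archive_cmd(
    "/path_to_bentoml_archive",
    "/path_to_generated_serverless_project",
    "aws-python3",
)
print(" ".join(cmd))
```

With BentoML installed, the returned list could be passed to subprocess.run(cmd, check=True) to actually run the command.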

Running this command generates a serverless project in the specified directory. Let’s take a look at the files it generates.

 /path_to_generated_serverless_project
   - serverless.yml
   - requirements.txt
   - copy_of_bentoml_archive/
   - (if platform is google-python, it will generate 

serverless.yml is the configuration file for the Serverless framework. It contains the configuration for the cloud provider you are deploying to and maps out which events trigger which functions. BentoML automatically modifies this file to add your model prediction as a function event and updates other info for you.

requirements.txt is a copy from your model archive; it includes all of the dependencies needed to run your model.

The generated handler file is the file that contains your function code. BentoML fills in this file’s function with your model archive class, so you can make predictions with this file right away without any modifications.

copy_of_bentoml_archive: A copy of your model archive. It will be bundled with the other files for serverless deployment.
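To make the "which events trigger which functions" idea in serverless.yml more concrete, here is a rough sketch of that structure written as a Python dict. The keys (service, provider, functions, handler, events) follow the Serverless framework's AWS layout, but the service name, function name, handler path, and runtime are made-up placeholders; the file BentoML actually generates may look different.

```python
# Illustrative structure only; every name and value below is an assumption.
serverless_config = {
    "service": "my-model-service",  # hypothetical service name
    "provider": {"name": "aws", "runtime": "python3.7"},
    "functions": {
        "predict": {  # hypothetical function name
            "handler": "handler.predict",  # hypothetical module.function entry point
            "events": [{"http": {"path": "predict", "method": "post"}}],
        }
    },
}


def list_triggers(config):
    """Return (function name, event type) pairs from a serverless-style config."""
    pairs = []
    for func_name, func in config.get("functions", {}).items():
        for event in func.get("events", []):
            for event_type in event:
                pairs.append((func_name, event_type))
    return pairs


print(list_triggers(serverless_config))
```

In this sketch, an HTTP POST event is what triggers the predict function.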


What’s next

After you generate this serverless project, you can deploy it right away if you have the default configuration for AWS or Google set up. Otherwise, you can update serverless.yml based on your own configuration.
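Deployment itself goes through the Serverless framework's own serverless deploy command, run from the generated project directory. A small sketch of wrapping that in Python, assuming the serverless CLI is installed and your cloud credentials are already configured (deploy_project is a hypothetical helper, not something BentoML provides):

```python
import shutil
import subprocess
from pathlib import Path


def deploy_project(project_dir, run=False):
    """Return the deploy command; execute it only when run=True.

    Hypothetical helper: it checks that serverless.yml exists, then
    optionally runs `serverless deploy` inside the project directory.
    """
    project = Path(project_dir)
    if not (project / "serverless.yml").is_file():
        raise FileNotFoundError(f"no serverless.yml found in {project}")
    cmd = ["serverless", "deploy"]
    if run:
        if shutil.which("serverless") is None:
            raise RuntimeError("the serverless CLI was not found on PATH")
        subprocess.run(cmd, cwd=project, check=True)
    return cmd
```

Calling deploy_project("/path_to_generated_serverless_project", run=True) would then deploy the project; leaving run=False only validates the directory and returns the command.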

We would love to hear feedback from you guys on this.





Edit: Styling

submitted by /u/yubozhao