[Project] Training models and running Jupyter Notebooks on AWS Spot Instances (cheaper and simpler than SageMaker)
Hi everyone,
I’ve developed a tool that simplifies training deep learning models on AWS: https://github.com/apls777/spotty. My goal was to make training on AWS GPU instances as simple as training on a local machine. Spotty automatically manages all the necessary AWS resources (AMIs, volumes, snapshots, SSH keys), runs Spot Instances to save up to 70% of the cost, and uses tmux to easily detach remote processes from their SSH sessions.
To train a model (and make it trainable by anyone with a couple of commands), you just need to create a single configuration file that describes the Docker container and the AWS instance parameters (a rough sketch is shown below).
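For a rough idea of what that file contains, here is a minimal sketch of a spotty.yaml. The field names and values are illustrative assumptions based on the description in this post, not the exact schema — check the GitHub README above for the real format:

```yaml
# spotty.yaml — illustrative sketch only; see the Spotty README for the actual schema
project:
  name: my-model                            # project name, used to tag AWS resources

container:
  image: tensorflow/tensorflow:latest-gpu   # any Docker image with your framework
  ports: [8888]                             # expose Jupyter Notebook

instance:
  region: us-east-1
  instanceType: p2.xlarge                   # GPU Spot Instance type
  volumes:
    - name: workspace
      size: 50                              # GB, persisted as an EBS volume/snapshot

scripts:
  jupyter: |
    jupyter notebook --allow-root --ip 0.0.0.0
```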
Then the workflow is super simple (the full command sequence is sketched after this list):
- Use the “spotty start” command to start your container on a cheap AWS Spot Instance. Your local project will be uploaded to the instance and available inside the container.
- Once the instance is up and running, use the “spotty ssh” command to connect to the container, or start Jupyter Notebook with the “spotty run jupyter” command (a custom script defined in the configuration file).
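Put together, and assuming the config above defines a “jupyter” script, a session looks like this:

```sh
spotty start        # launch the Spot Instance and sync the local project into the container
spotty ssh          # attach to the container over SSH (inside a tmux session)
spotty run jupyter  # run the custom "jupyter" script from the config file
```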
Here is an article on how to train a model using Spotty with a real-life example: https://towardsdatascience.com/how-to-train-deep-learning-models-on-aws-spot-instances-using-spotty-8d9e0543d365.
I really hope you find this tool useful, and I’d be happy to hear any feedback.