[D] Training on the cloud: GCP GPU pricing seems dramatically cheaper, why would you train on AWS or Azure?

Written by torontoai on May 10, 2019. Posted in Reddit MachineLearning.

As the title says, here is the on-demand cost of entry-level GPU instances on the major clouds:

AWS: p2.xlarge, 1 Tesla K80, 4 vCPUs, 61 GB RAM, $0.900/hr

Azure: NC6, 1 Tesla K80, 6 vCPUs, 56 GB RAM, $0.900/hr

GCP: 1 Tesla K80, 6 vCPUs, 52 GB RAM, $0.663/hr

Further, when training CNNs on the K80 I never use more than 4-5 GB of memory or need more than 4 vCPUs. Since GCP is the only cloud that lets me tune instance specs this finely, I can cut the cost of ML workloads even further. For example:

GCP: 1 Tesla K80, 4 vCPUs, 5 GB RAM, $0.424/hr

When I benchmarked ResNet-50, this cheaper configuration showed no performance decrease compared to the more expensive instance.
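To put the on-demand numbers above side by side, here is a minimal sketch that computes how far each option sits below the AWS price. The prices are the ones quoted in this post (May 2019); the dictionary keys and the `savings_pct` helper are just names made up for illustration.

```python
# Hourly on-demand prices in USD, copied from the figures quoted in the post.
ON_DEMAND = {
    "AWS p2.xlarge": 0.900,
    "Azure NC6": 0.900,
    "GCP K80, 6 vCPU, 52 GB": 0.663,
    "GCP K80, 4 vCPU, 5 GB (custom)": 0.424,
}


def savings_pct(price, baseline):
    """Percentage saved by paying `price` per hour instead of `baseline`."""
    return 100.0 * (baseline - price) / baseline


# Use the AWS price as the reference point (Azure is identical here).
baseline = ON_DEMAND["AWS p2.xlarge"]
for name, price in ON_DEMAND.items():
    pct = savings_pct(price, baseline)
    print(f"{name:32s} ${price:.3f}/hr  {pct:5.1f}% below AWS")
```

With these figures the standard GCP instance comes out roughly 26% cheaper than AWS or Azure, and the trimmed-down custom instance roughly 53% cheaper.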

Perhaps spot pricing (called low-priority on Azure and preemptible on GCP) comes into play; there GCP is in the middle of the pack:

AWS: p2.xlarge, 1 Tesla K80, 4 vCPUs, 61 GB RAM, $0.270/hr

Azure: NC6, 1 Tesla K80, 6 vCPUs, 56 GB RAM, $0.180/hr

GCP: 1 Tesla K80, 6 vCPUs, 52 GB RAM, $0.236/hr

Spot instances do not suit every use case, however, so the difference in regular on-demand pricing is still significant.
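For a rough sense of how much each provider discounts its interruptible instances, the sketch below compares the spot-style price with the on-demand price of the same instance. It uses only the prices quoted in this post; `PRICES` and `discount_pct` are hypothetical names chosen for the example.

```python
# (on-demand, spot/low-priority/preemptible) hourly prices in USD from the post.
PRICES = {
    "AWS p2.xlarge": (0.900, 0.270),
    "Azure NC6": (0.900, 0.180),
    "GCP K80, 6 vCPU, 52 GB": (0.663, 0.236),
}


def discount_pct(on_demand, spot):
    """Percentage discount of the spot price relative to the on-demand price."""
    return 100.0 * (1.0 - spot / on_demand)


# Print providers from cheapest to most expensive spot price.
for name, (on_demand, spot) in sorted(PRICES.items(), key=lambda kv: kv[1][1]):
    pct = discount_pct(on_demand, spot)
    print(f"{name:24s} ${spot:.3f}/hr  ({pct:.1f}% off on-demand)")
```

On these numbers the discounts are about 80% on Azure, 70% on AWS, and 64% on GCP, which is why GCP lands in the middle on absolute spot price despite having the lowest on-demand price.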

This all leaves me wondering:

If you train your models on the cloud, which provider do you use?

Can you imagine any reasons/use cases/etc that might warrant picking a provider other than GCP?

What is GCP's business model? How can they make money selling for so much less? Is this a loss leader to gain market share?

submitted by /u/Obventio