What is the benefit of training an ML model with an AWS SageMaker Estimator?

It looks like there are different routes to deploying an ML model on SageMaker. You can:

  • pre-train a model, create a deployment archive, then deploy

  • create an estimator, train the model on SageMaker with a script, then deploy

My question is: are there benefits to taking the second approach? To me, it seems like writing a training script would require a bit of trial and error and perhaps some extra work to package it all up neatly. Why not just train a model by running cells sequentially in a Jupyter notebook, where I can track each step, and then go with the first approach?
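For concreteness, here is roughly what I mean by the two routes, sketched with the SageMaker Python SDK and PyTorch (the role ARN, S3 paths, script names, and framework versions are just placeholders, not a definitive recipe):

```python
from sagemaker.pytorch import PyTorch, PyTorchModel

role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # placeholder role ARN

# Route 1: bring a model trained elsewhere, packaged as model.tar.gz on S3,
# plus an inference script, and deploy it to an endpoint.
pretrained = PyTorchModel(
    model_data="s3://my-bucket/artifacts/model.tar.gz",  # placeholder path
    role=role,
    entry_point="inference.py",      # your model_fn / predict_fn live here
    framework_version="2.1",
    py_version="py310",
)
predictor_1 = pretrained.deploy(initial_instance_count=1, instance_type="ml.m5.large")

# Route 2: hand SageMaker a training script; it runs as a managed training job,
# writes model.tar.gz to S3 for you, and the fitted estimator can deploy directly.
estimator = PyTorch(
    entry_point="train.py",          # your training script
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    framework_version="2.1",
    py_version="py310",
)
estimator.fit({"train": "s3://my-bucket/train-data/"})
predictor_2 = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```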

Does anyone have experience with both approaches and can compare/contrast them?

Topic: sagemaker, aws, machine-learning

Category: Data Science


The first approach works well when you are training a small model, or one that doesn't need much compute time, but for large models the second approach is generally preferred.

Reasons are as follows:

  • When training a large model, you may need distributed training, either data parallel or model parallel, which is straightforward to configure on a training job.
  • When training a large model, it is also best practice to use the second approach because if training stops abruptly, the job ends and you stop being charged; that is not the case with a notebook instance, which keeps running (and billing) until you shut it down. See the sketch after this list.
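As a rough sketch of what the second approach buys you (the role ARN, S3 paths, instance type, framework version, and the exact `distribution` option are assumptions that depend on your setup), the same Estimator can be switched to multi-instance, managed spot training with a few extra arguments:

```python
from sagemaker.pytorch import PyTorch

# Placeholder role ARN, script name, and S3 paths; adjust for your account.
role = "arn:aws:iam::123456789012:role/MySageMakerRole"

estimator = PyTorch(
    entry_point="train.py",
    role=role,
    instance_count=2,                   # scale out across instances for data-parallel training
    instance_type="ml.p3.16xlarge",
    framework_version="2.1",
    py_version="py310",
    # The exact distribution option depends on your framework version and instance type.
    distribution={"pytorchddp": {"enabled": True}},
    use_spot_instances=True,            # managed spot capacity at a discount
    max_run=3600,                       # hard cap on billed training time (seconds)
    max_wait=7200,                      # max_run plus time allowed waiting for spot capacity
    checkpoint_s3_uri="s3://my-bucket/checkpoints/",  # resume here if spot capacity is reclaimed
)

# Billing stops as soon as the job finishes or fails; a notebook instance
# keeps accruing charges until you shut it down.
estimator.fit({"train": "s3://my-bucket/train-data/"})
```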

So that extra work pays off: it helps you train faster and saves you money.
