Alternative to EC2 for running ML batch training jobs on AWS

We are building an ML pipeline on AWS, which will obviously require some heavy-compute components, including preprocessing and batch training. Most of the pipeline runs on Lambda, but Lambda limits how long a single invocation can run (~15 minutes). So for longer-running jobs, such as batch training of ML models, we will presumably need access to something like EC2 instances. For example, a Lambda function could be invoked and then kick off an EC2 instance …
Category: Data Science
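
A minimal sketch of the pattern described in the question: a Lambda handler that launches a short-lived EC2 instance to run a training script and lets the instance terminate itself when done. The AMI ID, instance type, bucket, script name, and IAM instance profile below are placeholders, not values from the original question.

    # Hypothetical Lambda handler: start an EC2 instance that runs a training
    # script pulled from S3 and shuts itself down (and terminates) afterwards.
    import boto3

    ec2 = boto3.client("ec2")

    # Shell script passed as EC2 user data; bucket and script names are placeholders.
    USER_DATA = """#!/bin/bash
    aws s3 cp s3://my-ml-bucket/train.py /tmp/train.py
    python3 /tmp/train.py
    shutdown -h now
    """

    def handler(event, context):
        response = ec2.run_instances(
            ImageId="ami-0123456789abcdef0",   # placeholder deep learning AMI
            InstanceType="m5.xlarge",           # sized for the training job
            MinCount=1,
            MaxCount=1,
            UserData=USER_DATA,
            # 'terminate' means the shutdown in the user data also cleans up the instance
            InstanceInitiatedShutdownBehavior="terminate",
            IamInstanceProfile={"Name": "training-instance-role"},  # placeholder role
        )
        return {"instance_id": response["Instances"][0]["InstanceId"]}

Managed alternatives such as AWS Batch or SageMaker training jobs follow the same shape: the Lambda function only submits the job and returns, while the heavy compute runs elsewhere.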

Deploying ML/Deep Learning on AWS Lambda for Long-Running Training, not just Inference

Serverless technology can be used to deploy ML models to production, since deployment packages can be compressed if they are too large (or built from source with unneeded dependencies stripped). But there is also the use case of deploying ML for training, not just inference; for example, a company might want to allow power users to retrain a model from the front end. Is this feasible on Lambda given the long training times? Latency, at least, wouldn't be an issue (cold start delay …
Category: Data Science
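
One way long training is sometimes squeezed into Lambda's 15-minute limit is to train in bounded chunks, checkpoint to S3, and have the function re-invoke itself asynchronously until training finishes. The sketch below assumes hypothetical helpers (load_checkpoint, train_one_chunk, save_checkpoint) and a placeholder bucket; it only illustrates the control flow, not the author's actual setup.

    # Sketch: chunked training with checkpointing and asynchronous self re-invocation.
    import json
    import boto3

    s3 = boto3.client("s3")
    lam = boto3.client("lambda")

    BUCKET = "my-ml-bucket"                  # placeholder bucket
    CHECKPOINT_KEY = "checkpoints/model.pkl" # placeholder key

    def handler(event, context):
        epoch = event.get("epoch", 0)
        # Resume from the last checkpoint and train only for a few minutes,
        # leaving headroom before the hard timeout.
        state = load_checkpoint(s3, BUCKET, CHECKPOINT_KEY)   # hypothetical helper
        state, done = train_one_chunk(state, epoch)           # hypothetical helper
        save_checkpoint(s3, BUCKET, CHECKPOINT_KEY, state)    # hypothetical helper

        if not done:
            # Fire-and-forget re-invocation continues training in a fresh 15-minute window.
            lam.invoke(
                FunctionName=context.function_name,
                InvocationType="Event",
                Payload=json.dumps({"epoch": epoch + 1}),
            )
        return {"epoch": epoch, "done": done}

Whether this is worth the complexity compared with handing the job to EC2, AWS Batch, or SageMaker is exactly the trade-off the question is asking about.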

sklearn and pandas in AWS Lambda

I made a front end from which I would like to make REST calls to an AWS Lambda function behind AWS API Gateway. I dumped my model (and likewise my encoders) as pickle files after initially training them locally, and then stored these files in an S3 bucket. The problem is that I cannot import libraries such as pandas and sklearn to make model predictions, because the Lambda console is unable to find them. Does anyone have any suggestions to …
Category: Data Science
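
The usual fix is to package pandas/scikit-learn outside the console editor, for example via a Lambda layer or a container image, and load the pickled model from S3 at startup. A minimal sketch, assuming the libraries are available through such a layer and using placeholder bucket and key names:

    # Sketch: load a pickled sklearn model from S3 once per container and predict.
    # Assumes pandas/scikit-learn come from a Lambda layer or container image.
    import json
    import pickle
    import boto3
    import pandas as pd

    s3 = boto3.client("s3")
    BUCKET = "my-model-bucket"   # placeholder
    MODEL_KEY = "model.pkl"      # placeholder

    # Loaded outside the handler so warm invocations reuse the same model object.
    _model = pickle.loads(s3.get_object(Bucket=BUCKET, Key=MODEL_KEY)["Body"].read())

    def handler(event, context):
        # With API Gateway proxy integration, the request body arrives as a JSON string.
        features = pd.DataFrame([json.loads(event["body"])])
        prediction = _model.predict(features)
        return {
            "statusCode": 200,
            "body": json.dumps({"prediction": prediction.tolist()}),
        }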
