I have deployed a background removal model (PyTorch, pre-trained U2-Net) on AWS using Lambda, an EFS file system, and API Gateway. I store the model on EFS and load it into the Lambda function; the model is around 170 MB. The API Gateway response time is around 32 seconds. Is there any way to speed up the response time?
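A common culprit here is reloading the 170 MB model from EFS on every invocation. A minimal sketch of caching the model at module level so only cold starts pay the load cost, assuming the EFS access point is mounted at a path like /mnt/models and that torch is available via a layer or container image (the path and helper names below are placeholders):

```python
import os
import torch  # assumed to come from a Lambda layer or container image

MODEL_PATH = os.environ.get("MODEL_PATH", "/mnt/models/u2net.pth")  # hypothetical EFS mount path

_model = None  # module-level cache: reused across warm invocations of the same container


def _get_model():
    """Load the model once per container so warm requests skip the EFS read."""
    global _model
    if _model is None:
        # If the checkpoint is a state_dict rather than a full model object,
        # instantiate the U2NET class first and call load_state_dict instead.
        _model = torch.load(MODEL_PATH, map_location="cpu")
        _model.eval()
    return _model


def lambda_handler(event, context):
    model = _get_model()
    # ... decode the input image from the event, run inference, return the mask ...
    return {"statusCode": 200, "body": "ok"}
```

With this pattern, warm invocations should be dominated by inference time rather than model loading; provisioned concurrency can keep containers warm so users rarely hit the slow cold-start path.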
We are building an ML pipeline on AWS, which will obviously require some heavy-compute components, including preprocessing and batch training. Most of the pipeline is on Lambda, but Lambda has a hard limit on how long a job can run (~15 minutes). So for longer-running jobs, like batch training of ML models, we would need(?) to use something like EC2 instances. For example, a Lambda function could be invoked and then kick off an EC2 instance …
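A minimal sketch of that hand-off, where Lambda only launches the instance and returns immediately; the AMI ID, instance type, bucket, and training script path are all placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Bootstrap script the instance runs at launch; bucket and script path are placeholders.
USER_DATA = """#!/bin/bash
aws s3 cp s3://my-ml-bucket/train.py /home/ec2-user/train.py
python3 /home/ec2-user/train.py
shutdown -h now
"""


def lambda_handler(event, context):
    # Lambda just starts the instance; the long-running training happens on EC2,
    # so the 15-minute Lambda limit only applies to this launch call.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI with the ML dependencies baked in
        InstanceType="m5.xlarge",          # size to match the training workload
        MinCount=1,
        MaxCount=1,
        UserData=USER_DATA,
        InstanceInitiatedShutdownBehavior="terminate",  # instance cleans itself up after training
    )
    return {"instanceId": response["Instances"][0]["InstanceId"]}
```

Alternatives worth comparing before committing to raw EC2 include AWS Batch, SageMaker training jobs, or Step Functions orchestrating the whole pipeline, since they handle provisioning and teardown for you.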
Serverless technology can be used to deploy ML models to production, since deployment package sizes can be compressed if they are too large (or built from source with unneeded dependencies stripped). But there is also the use case of deploying ML for training, not just inference. For example, a company might want to allow power users to retrain a model from the front end. Is this feasible for Lambda given the long training times? Whereas latency wouldn't be an issue (cold-start delay …
I made a front end from which I would like to make REST calls to an AWS Lambda function interfaced with AWS API Gateway. I dumped my model (and my encoders) as pickle files, which I initially trained locally, and then stored these files in an S3 bucket. The problem is that I cannot import libraries such as pandas and sklearn to make model predictions, because the Lambda console is unable to find them. Does anyone have any suggestions to …
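The usual fix is that pandas and scikit-learn are not part of the default Lambda runtime, so they have to be shipped with the function, either as a Lambda layer (e.g. `pip install` them into a folder and zip it) or as a container-image deployment. A minimal sketch of the handler once the dependencies are available, with a placeholder bucket name, object key, and event shape:

```python
import pickle

import boto3
import pandas as pd  # assumed to be provided by a Lambda layer or container image

s3 = boto3.client("s3")

BUCKET = "my-model-bucket"   # placeholder bucket name
MODEL_KEY = "model.pkl"      # placeholder object key

_model = None  # cached across warm invocations so S3 is only hit on cold starts


def _load_model():
    global _model
    if _model is None:
        body = s3.get_object(Bucket=BUCKET, Key=MODEL_KEY)["Body"].read()
        _model = pickle.loads(body)  # the pickled sklearn model trained locally
    return _model


def lambda_handler(event, context):
    model = _load_model()
    # Assumes the request body carries a dict of feature name -> value.
    features = pd.DataFrame([event["features"]])
    prediction = model.predict(features)
    return {"prediction": prediction.tolist()}
```

One caveat: the scikit-learn version used to unpickle inside Lambda should match the version used for training, otherwise loading the pickle may fail or behave differently.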