sklearn and pandas in AWS Lambda

I built a front end that should make REST calls to an AWS Lambda function exposed through AWS API Gateway.

I trained my model locally and dumped it (along with my encoders) as pickle files, which I then stored in an S3 bucket.

The problem is that I cannot import libraries such as pandas and sklearn to make model predictions, because Lambda is unable to find them.

Does anyone have any suggestions to help solve this issue?


Using Lambda layers makes the dependencies more reusable and potentially easier to maintain and deploy.

The short version is:

  1. Create a requirements.txt, from pip freeze or similar, that looks like this:

        pandas==0.23.4
        pytz==2018.7

  2. Create a get_layer_packages.sh bash script, which runs pip inside Docker, that looks like this:

        #!/bin/bash

        # Python layer packages must sit in a top-level "python" directory.
        export PKG_DIR="python"

        rm -rf "${PKG_DIR}" && mkdir -p "${PKG_DIR}"

        # --no-deps installs only the listed packages; their own dependencies
        # (for example numpy for pandas) must be listed too or come from another layer.
        docker run --rm -v "$(pwd)":/foo -w /foo lambci/lambda:build-python3.6 \
            pip install -r requirements.txt --no-deps -t "${PKG_DIR}"

  3. Run it all from the terminal like this:

        chmod +x get_layer_packages.sh
        ./get_layer_packages.sh
        zip -r my-Python36-Pandas23.zip .
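
As an alternative to the zip command in the last step, here is a minimal Python sketch that packs only the installed packages, so that they land under a top-level python/ folder in the archive (the layout Lambda expects for Python layers). The function name build_layer_zip is hypothetical and not part of any AWS tooling; it only uses the standard library.

```python
import os
import zipfile


def build_layer_zip(pkg_dir, zip_path):
    """Zip the contents of pkg_dir under a top-level python/ folder."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(pkg_dir):
            for name in files:
                full_path = os.path.join(root, name)
                rel_path = os.path.relpath(full_path, pkg_dir)
                # Zip entry names always use forward slashes.
                arcname = "python/" + rel_path.replace(os.sep, "/")
                zf.write(full_path, arcname)
    return zip_path


# Example call, assuming the script above has already filled ./python:
# build_layer_zip("python", "my-Python36-Pandas23.zip")
```

Compared with zipping the whole working directory, this leaves requirements.txt and the shell script out of the layer archive.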

I'm not very experienced with Python and spent a decent chunk of time messing around with zipping up pandas and virtual envs. I have never really used Docker for anything IRL either, but this process is actually far more accessible (and better documented) than the venv > zip > upload process I was using before.


You need to create a deployment package which includes the packages you want to use in Lambda (sklearn and pandas).

You can then either upload that deployment package to S3 and point the Lambda function at it, or upload it directly to the Lambda function.

The Lambda function code has to be written outside the AWS Lambda console and included in the deployment package. Here's a guide on how to do it.
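
For illustration, the function code in such a package might look roughly like the sketch below. It is only a sketch under several assumptions: the bucket name, object key, and the {"rows": [...]} request shape are hypothetical, and the event is assumed to come from an API Gateway proxy integration. Unpickling a scikit-learn model also requires sklearn to be importable, which is exactly what the deployment package (or a layer) provides.

```python
import json
import pickle

# Hypothetical names: replace with your own bucket and object key.
MODEL_BUCKET = "my-model-bucket"
MODEL_KEY = "model.pkl"

_model = None  # cached between warm invocations of the same container


def load_pickle_from_s3(s3_client, bucket, key):
    """Download one S3 object and unpickle it."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return pickle.loads(response["Body"].read())


def predict_from_event(model, event):
    """Build an API Gateway proxy response from the request body."""
    # Assumed payload shape: {"rows": [[f1, f2, ...], ...]}
    rows = json.loads(event["body"])["rows"]
    predictions = model.predict(rows)
    # NumPy arrays are not JSON serializable, so convert to plain lists.
    if hasattr(predictions, "tolist"):
        predictions = predictions.tolist()
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"predictions": list(predictions)}),
    }


def lambda_handler(event, context):
    """Entry point: load the model once, then serve predictions."""
    global _model
    if _model is None:
        import boto3  # provided by the Lambda Python runtime

        _model = load_pickle_from_s3(boto3.client("s3"), MODEL_BUCKET, MODEL_KEY)
    return predict_from_event(_model, event)
```

Keeping the model in a module-level variable means the S3 download only happens on a cold start. If the model was trained on a pandas DataFrame with named columns, the rows would likely need to be wrapped in a DataFrame before calling predict.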
