Intermittent type error when running CNN inside Docker container

I've been working on a simple computer-vision API with a few endpoints for extracting useful information from eBay images. The API lives in a Docker container that, whenever the container is built, looks for the .h5 model files, downloads them if they aren't there, spins everything up, and starts the server. One of the endpoints is a transfer-trained VGG-16 classifier that sorts images into buckets for further analysis and human review. This endpoint is exhibiting some odd behavior: run straight from my local machine it works fine, and inside the Docker container it also works fine as long as it doesn't sit idle for too long. But if it does sit idle, I get the following error and the container has to be restarted:

Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", shape=(3, 3, 512, 512), dtype=float32) is not an element of this graph.

There doesn't seem to be an error in the code, and the Dockerfile looks fine. I don't understand why this only happens in the containerized API and not in the local version. If anyone has seen this before, please help!

Here is my Dockerfile:

FROM python:3.7.3
WORKDIR /code
COPY requirements.txt /code
RUN pip install -r requirements.txt
RUN apt-get update && apt-get install -y tesseract-ocr
COPY . /code
EXPOSE 8000
CMD ["python", "-u", "manage.py", "runserver", "0.0.0.0:8000"]

The part of manage.py that manages the relevant .h5 file:

import json
import os
import sys

import boto

if __name__ == '__main__':
    model_list = []
    for name in os.listdir("."):
        if name.endswith(".h5"):
            model_list.append(name)

    if not model_list:
        sys.stdout.write("No neural network found. Downloading model.\n")
        if not os.path.exists("../aws-credentials.json"):
            sys.exit("No aws-credentials.json found; cannot download model.")
        with open("../aws-credentials.json") as f:
            sys.stdout.write("Loading developer credentials.\n")
            data = json.load(f)
        conn = boto.connect_s3(data["accessKeyId"], data["secretAccessKey"])

        bucket = conn.get_bucket('mybucket')
        key = bucket.get_key('/developer/model.h5')
        sys.stdout.write("Model is preparing to load. Please wait for the download to complete. This may take some time.\n")
        key.get_contents_to_filename('model.h5')
        sys.stdout.write("Download complete.\n")
    main()

and the CNN code itself:

def runNetwork(request, encoded_url):
    import urllib.parse
    from io import BytesIO

    import numpy as np
    import requests
    from django.http import JsonResponse
    from keras.preprocessing import image
    from keras.applications.vgg16 import preprocess_input
    import keras.models

    model = keras.models.load_model('model.h5')
    uri = urllib.parse.unquote(encoded_url)
    response = requests.get(uri)
    img = image.load_img(BytesIO(response.content), target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    features = model.predict_classes(x)
    if features == [0]:
        code = "foo"
    elif features == [1]:
        code = "bar"
    elif features == [2]:
        code = "baz"
    ...

    features = int(features[0])
    obj_json = {"input_uri": uri, "prediction_index": features, "prediction_code": code}
    return JsonResponse(obj_json)
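(As an aside, for anyone reusing this routing pattern: the `encoded_url` path parameter is just a percent-encoded URL, and the round trip is pure standard library. A quick sketch with a made-up image URL:)

```python
from urllib.parse import quote, unquote

# Hypothetical eBay image URL, percent-encoded so it can travel as a single
# path segment in the API route, then decoded back inside the view.
original = "https://i.ebayimg.com/images/g/abc/s-l1600.jpg"
encoded = quote(original, safe="")

assert "/" not in encoded          # safe="" encodes the slashes too
assert unquote(encoded) == original
```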

and the full stack trace (abbreviated to the relevant frames):

/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py in _run
                subfeed, allow_tensor=True, allow_operation=False) …
/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in as_graph_element
      return self._as_graph_element_locked(obj, allow_tensor, allow_operation) …
/usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in _as_graph_element_locked
        raise ValueError("Tensor %s is not an element of this graph." % obj) …
During handling of the above exception (Tensor Tensor("Placeholder:0", shape=(3, 3, 512, 512), dtype=float32) is not an element of this graph.), another exception occurred:
/usr/local/lib/python3.7/site-packages/django/core/handlers/exception.py in inner
            response = get_response(request) …
/usr/local/lib/python3.7/site-packages/django/core/handlers/base.py in _get_response
                response = self.process_exception_by_middleware(e, request) …
/usr/local/lib/python3.7/site-packages/django/core/handlers/base.py in _get_response
                response = wrapped_callback(request, *callback_args, **callback_kwargs) …
/code/SlabIDs/views.py in runNetwork
    model=keras.models.load_model('SlabNet3.h5') …
/usr/local/lib/python3.7/site-packages/keras/engine/saving.py in load_model
        model = _deserialize_model(f, custom_objects, compile) …
/usr/local/lib/python3.7/site-packages/keras/engine/saving.py in _deserialize_model
    K.batch_set_value(weight_value_tuples) …
/usr/local/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py in batch_set_value
        get_session().run(assign_ops, feed_dict=feed_dict) …
/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py in run
                         run_metadata_ptr) …
/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py in _run
                'Cannot interpret feed_dict key as Tensor: ' + e.args[0]) …



OK, I made a dumb mistake. Please disregard the panicked ranting above. In case anyone else bumps up against something like this, though: the issue was resolved simply by remembering to clear my Keras session at the end of each run:

from keras import backend as K
...code...
K.clear_session()
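For completeness, here is roughly how that looks applied to the endpoint above. This is only a sketch, assuming the same Keras 2.x / TF 1.x stack as in the question (function and path names simplified); wrapping the work in try/finally guarantees the session is cleared even if the request fails partway, so the next request starts with a fresh default graph instead of the stale one that triggered the "not an element of this graph" error:

```python
def run_network(h5_path, x):
    # Lazy imports, matching the style of the original view.
    from keras import backend as K
    import keras.models

    try:
        model = keras.models.load_model(h5_path)
        return model.predict_classes(x)
    finally:
        # Reset the TF graph/session state after every request, success or not.
        K.clear_session()
```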
