Why is my container copying a nonexistent file?

I am currently trying to deploy and serve a fastText model for a business venture. I decided to use Google's Vertex AI (if you have a better idea of something to use, please share!). I created a Dockerfile and a training script to train my model, built the Docker image, and pushed it to the Google Container Registry. Here is the code for it.

Dockerfile:

FROM python:3.8-slim-buster

RUN apt-get update && apt-get install -y \
        build-essential \
        wget \
        git \
        python-dev \
        unzip \
        python-numpy \
        python-scipy \
        && rm -rf /var/lib/apt/lists/*

RUN wget -nv \
    https://dl.google.com/dl/cloudsdk/release/google-cloud-sdk.tar.gz && \
    mkdir /root/tools && \
    tar xvzf google-cloud-sdk.tar.gz -C /root/tools && \
    rm google-cloud-sdk.tar.gz && \
    /root/tools/google-cloud-sdk/install.sh --usage-reporting=false \
        --path-update=false --bash-completion=false \
        --disable-installation-options && \
    rm -rf /root/.config/* && \
    ln -s /root/.config /config && \
    # Remove the backup directory that gcloud creates
    rm -rf /root/tools/google-cloud-sdk/.install/.backup

# Path configuration
ENV PATH $PATH:/root/tools/google-cloud-sdk/bin
# Make sure gsutil will use the default service account
RUN echo '[GoogleCompute]\nservice_account = default' > /etc/boto.cfg

RUN pip3 install fasttext
RUN pip3 install google
RUN pip3 install google-cloud-storage
RUN pip3 install --upgrade google-api-python-client
RUN pip3 install --upgrade google-cloud
COPY . .

ENTRYPOINT ["python3", "trainer.py"]

trainer.py:

import fasttext
import tempfile
from google.cloud import storage
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file('endless-bank-344008-a75f5b89470f.json')

# Download the pretrained vectors from the bucket into a temp file and load them
with tempfile.NamedTemporaryFile() as tmp_file:
    local_model_file = tmp_file.name
    remote_model_file = storage.Client('endless-bank-344008', credentials).bucket('bucket2035').blob('cc.en.300.bin')
    remote_model_file.download_to_filename(local_model_file)
    model_1 = fasttext.load_model(local_model_file)

# Save the model locally, then upload it to the bucket as 'plagscanner.bin'
model_1.save_model("plagscan.bin")
target = storage.Client('endless-bank-344008', credentials).bucket('bucket2035').blob('plagscanner.bin')
target.upload_from_filename('plagscan.bin')

This code works, which is great. I run it on the Vertex AI platform: I press "create a model", check everything that applies, and use a custom container (after selecting the one I created, which is now in the Google Container Registry). It runs, very cool, with no prediction container. It doesn't create a model because there is no prediction container, but it runs successfully, and in bucket2035 there is indeed an output file "plagscanner.bin".
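(To double-check the training output, I list the bucket contents afterwards with a quick script like the one below; the project, bucket, and service-account key file are the same ones as in trainer.py.)

from google.cloud import storage
from google.oauth2 import service_account

# Same project, bucket, and service-account key as in trainer.py
credentials = service_account.Credentials.from_service_account_file('endless-bank-344008-a75f5b89470f.json')
client = storage.Client('endless-bank-344008', credentials)
for blob in client.list_blobs('bucket2035'):
    print(blob.name)  # 'plagscanner.bin' shows up here

Then I created a Dockerfile and Flask app to serve as the prediction container. Here they are.

Dockerfile: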

FROM python:3.8-slim-buster

RUN apt-get update && apt-get install -y \
        build-essential \
        wget \
        git \
        python-dev \
        unzip \
        python-numpy \
        python-scipy \
        && rm -rf /var/lib/apt/lists/*

RUN wget -nv \
    https://dl.google.com/dl/cloudsdk/release/google-cloud-sdk.tar.gz && \
    mkdir /root/tools && \
    tar xvzf google-cloud-sdk.tar.gz -C /root/tools && \
    rm google-cloud-sdk.tar.gz && \
    /root/tools/google-cloud-sdk/install.sh --usage-reporting=false \
        --path-update=false --bash-completion=false \
        --disable-installation-options && \
    rm -rf /root/.config/* && \
    ln -s /root/.config /config && \
    # Remove the backup directory that gcloud creates
    rm -rf /root/tools/google-cloud-sdk/.install/.backup

# Path configuration
ENV PATH $PATH:/root/tools/google-cloud-sdk/bin
# Make sure gsutil will use the default service account
RUN echo '[GoogleCompute]\nservice_account = default' > /etc/boto.cfg

RUN pip3 install flask
RUN pip3 install fasttext
RUN pip3 install google
RUN pip3 install google-cloud-storage
RUN pip3 install --upgrade google-api-python-client
RUN pip3 install --upgrade google-cloud
RUN pip3 install simplejson

COPY . .
ENV FLASK_APP=app.py

EXPOSE 8080

CMD flask run --host=0.0.0.0 --port=8080
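(One thing I'm not sure about: I hardcode port 8080 here. As far as I can tell, Vertex AI tells the serving container which port to listen on via an AIP_HTTP_PORT environment variable that defaults to 8080, so the hardcoded value should line up; reading it explicitly would look something like this sketch, but this is just my assumption.)

import os

from flask import Flask

app = Flask(__name__)

# Assumption: Vertex AI sets AIP_HTTP_PORT for custom serving containers;
# fall back to 8080 when running locally
port = int(os.environ.get('AIP_HTTP_PORT', '8080'))

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=port)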

Flask app (app.py):

import fasttext
import json
import os
from flask import Flask, request, Response

# model_dir is not defined anywhere in this file
a = os.path.join(model_dir, 'plagscanner.bin')
model_1 = fasttext.load_model(a)

app = Flask(__name__)

# Health-check route
@app.route("/isalive")
def isalive():
    print("/isalive request")
    status_code = Response(status=200)
    return status_code

# Flask route for predictions
@app.route('/predict', methods=['GET', 'POST'])
def prediction():
    result = request.get_json(silent=True, force=True)
    data = result['words']
    # get_word_vector returns a numpy array, so convert each to a plain list for JSON
    wordvectors = json.dumps([model_1.get_word_vector(x).tolist() for x in data])
    return wordvectors

if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0', port=8080)
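(Note that model_dir in the version above is never defined; I assumed Vertex AI would hand the serving container the location of the model artifacts somehow. From what I can tell it sets an AIP_STORAGE_URI environment variable pointing at the gs:// directory that holds them, which is a Cloud Storage URI rather than a local path, so the file would still need to be downloaded before fasttext could load it. A sketch of what I mean:)

import os

# Assumption: Vertex AI sets AIP_STORAGE_URI on deployed serving containers,
# e.g. 'gs://bucket2035/model'. It is a GCS URI, not a local directory, so
# os.path.join(model_dir, ...) alone would not give a loadable local file path.
model_dir = os.environ.get('AIP_STORAGE_URI', '')
print('model artifacts live at:', model_dir)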

Now... this should work, right? Wrong. I built this container and pushed it to the Google Container Registry, and it didn't work; bizarrely, it gave me the error: Training pipeline failed with error message: There are no files under "gs://bucket2035/model" to copy. Very bizarre, so I tried a variation of the app.py code that downloads the model file from the bucket instead:

import fasttext
import json
import tempfile
from flask import Flask, request, Response
from google.cloud import storage
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file('endless-bank-344008-a75f5b89470f.json')

# Download the model file from the bucket into a temp file and load it at startup
with tempfile.NamedTemporaryFile() as tmp_file:
    local_model_file = tmp_file.name
    remote_model_file = storage.Client('endless-bank-344008', credentials).bucket('bucket2035').blob('cc.en.300.bin')
    remote_model_file.download_to_filename(local_model_file)
    model_1 = fasttext.load_model(local_model_file)

app = Flask(__name__)

# Health-check route
@app.route("/isalive")
def isalive():
    print("/isalive request")
    status_code = Response(status=200)
    return status_code

# Flask route for predictions
@app.route('/predict', methods=['GET', 'POST'])
def prediction():
    result = request.get_json(silent=True, force=True)
    data = result['words']
    # get_word_vector returns a numpy array, so convert each to a plain list for JSON
    wordvectors = json.dumps([model_1.get_word_vector(x).tolist() for x in data])
    return wordvectors

if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0', port=8080)
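(For completeness, the /predict route expects a JSON body with a "words" list; a quick local test would look like the snippet below, where the URL and the words are just examples.)

import requests

# Example request against a locally running container; URL and words are made up
resp = requests.post(
    'http://localhost:8080/predict',
    json={'words': ['plagiarism', 'scanner']},
)
print(resp.json())  # one vector (list of floats) per input word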

Here is the full error :

Training pipeline failed with error message: There are no files under "gs://bucket2035/model" to copy.

Now guess what happens! It gives the same error. I don't understand this: what is it trying to copy? Why is it not working? Is there another solution besides Vertex AI that I should be using for this very simple thing? What is the meaning of life (lol)? Please help; I've tried many things and none of them work, and I suspect there must be an easier solution to this problem. Anyway, any help would be appreciated!


