Python asyncio and subprocess deployment on IIS: HTTP response is returned before the other script finishes running

I'm having trouble building a real-time status update, served through a web framework, for merging new datasets with an old one and reporting the results of machine learning model creation. The task consists of the following steps:

  1. A user/client sends a new dataset as a .CSV file to the server,
  2. On the server side, my Windows machine receives the file and sends an acknowledgement,
  3. The new dataset is merged with the old one for the new machine learning model, and
  4. Another Python script runs (it creates a new sequential deep-learning model). Only after that script completes successfully should my code return the response to the client.

I have deployed my Python Flask application on IIS 10. The main Flask API script has to wait for the model creation script to finish. The model creation script involves several stages: loading datasets, tokenizing, one-hot encoding, padding, model training for 100 epochs, and finally producing prediction results.

My exact goal is for the Flask API to wait until the entire process is complete. The script invoked through subprocess.run() takes 8-9 minutes to complete. In development mode the code works without any issues. In production mode on IIS, however, it does not wait for the whole process and returns a response to the client within 6-7 seconds.

For debugging, I added logging to record all events in both the Flask script and the model creation script. From the logs I learned that the model creation script only ran about 10% of the way. First I tried simply using async def and await around subprocess.run(), which made no difference. Then I added threading with get_event_loop() and run_until_complete() to make the parent code wait until the whole process finished. None of this worked, and I haven't been able to find a proper solution. Please let me know what I did wrong. Thank you.
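
To narrow down where the time goes, it can help to confirm that subprocess.run() itself blocks until the child exits, and to record the child's exit code and error output. The following is a minimal sketch, assuming a hypothetical helper run_and_report() and a stand-in child command rather than the real main.py:

```python
import subprocess
import sys
import time


def run_and_report(args):
    """Run a child process, block until it exits, and report what happened.

    Hypothetical helper for illustration; not part of the application code.
    """
    started = time.monotonic()
    # subprocess.run() returns only after the child process has exited.
    result = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    elapsed = time.monotonic() - started
    return {
        "returncode": result.returncode,
        "elapsed": elapsed,
        "stderr": result.stderr.decode(errors="replace"),
    }


# Stand-in child: sleeps briefly, then exits on purpose with code 3.
child = [sys.executable, "-c", "import sys, time; time.sleep(0.2); sys.exit(3)"]
report = run_and_report(child)
```

Writing these three values to the log would show whether the child stopped early (non-zero return code, short elapsed time) or whether the parent really returned before the child finished.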

Configurations:

  1. Python 3.7.9
  2. Windows Server 2019, and
  3. IIS 10.0 Express

My code:

import os
import time
import glob
import subprocess
import pandas as pd
from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename
from datetime import datetime
import logging
import asyncio
from concurrent.futures import ThreadPoolExecutor

ALLOWED_EXTENSIONS = {'csv', 'xlsx'}
_executor = ThreadPoolExecutor(1)

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = "C:\\inetpub\\wwwroot\\iAssist_IT_support\\New_IT_support_datasets"
currentDateTime = datetime.now()
filenames = None

logger = logging.getLogger(__name__)
app.logger.setLevel(logging.DEBUG)

formatter = logging.Formatter('%(asctime)s:%(name)s:%(message)s')

file_handler = logging.FileHandler('model-creation-status.log')
file_handler.setFormatter(formatter)

# stream_handler = logging.StreamHandler()
# stream_handler.setFormatter(formatter)

app.logger.addHandler(file_handler)


# app.logger.addHandler(stream_handler)


def allowed_file(filename):
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


@app.route('/file_upload')
def home():
    return jsonify("Hello, This is a file-upload API, To send the file, use http://13.213.81.139/file_upload/send_file")


@app.route('/file_upload/status1', methods=['POST'])
def upload_file():
    app.logger.debug("/file_upload/status1 route is executed")
    # check if the post request has the file part
    if 'file' not in request.files:
        app.logger.debug("No file part in the request")
        response = jsonify({'message': 'No file part in the request'})
        response.status_code = 400
        return response
    file = request.files['file']
    if file.filename == '':
        app.logger.debug("No file selected for uploading")
        response = jsonify({'message': 'No file selected for uploading'})
        response.status_code = 400
        return response
    if file and allowed_file(file.filename):
        filename = secure_filename(file.filename)
        file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
        print(filename)
        print(file)
        app.logger.debug("Spreadsheet received successfully")
        response = jsonify({'message': 'Spreadsheet uploaded successfully'})
        response.status_code = 201
        return response
    else:
        app.logger.debug("Allowed file types are csv or xlsx")
        response = jsonify({'message': 'Allowed file types are csv or xlsx'})
        response.status_code = 400
        return response


@app.route('/file_upload/status2', methods=['POST'])
def status1():
    global filenames
    app.logger.debug("file_upload/status2 route is executed")
    if request.method == 'POST':
        # Get data in json format
        if request.get_json():
            filenames = request.get_json()
            app.logger.debug(filenames)
            filenames = filenames['data']
            # print(filenames)
            folderpath = glob.glob('C:\\inetpub\\wwwroot\\iAssist_IT_support\\New_IT_support_datasets\\*.csv')
            latest_file = max(folderpath, key=os.path.getctime)
            # print(latest_file)
            time.sleep(3)
            if filenames in latest_file:
                df1 = pd.read_csv("C:\\inetpub\\wwwroot\\iAssist_IT_support\\New_IT_support_datasets\\" +
                                  filenames, names=["errors", "solutions"])
                df1 = df1.drop(0)
                # print(df1.head())
                df2 = pd.read_csv("C:\\inetpub\\wwwroot\\iAssist_IT_support\\existing_tickets.csv",
                                  names=["errors", "solutions"])
                combined_csv = pd.concat([df2, df1])
                combined_csv.to_csv("C:\\inetpub\\wwwroot\\iAssist_IT_support\\new_tickets-chatdataset.csv",
                                    index=False, encoding='utf-8-sig')
                time.sleep(2)
                # return redirect('/file_upload/status2')
    return jsonify('New data merged with existing datasets')


@app.route('/file_upload/status3', methods=['POST'])
def status2():
    app.logger.debug("file_upload/status3 route is executed")
    if request.method == 'POST':
        # Get data in json format
        if request.get_json():
            message = request.get_json()
            message = message['data']
            app.logger.debug(message)
    return jsonify("New model training is in progress don't upload new file")


@app.route('/file_upload/status4', methods=['POST'])
def model_creation():
    app.logger.debug("file_upload/status4 route is executed")
    if request.method == 'POST':
        # Get data in json format
        if request.get_json():
            message = request.get_json()
            message = message['data']
            app.logger.debug(message)
            app.logger.debug(currentDateTime)

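            def model_run():
                app.logger.debug("model script starts to run")
                # Keep the result so the exit code and error output can be logged.
                result = subprocess.run("python C:\\.....\\IT_support_chatbot-master\\"
                                        "Python_files\\main.py", shell=True,
                                        stderr=subprocess.PIPE)
                # time.sleep(20)
                # Report the real outcome instead of assuming success.
                app.logger.debug("script exited with return code %s", result.returncode)
                if result.returncode != 0:
                    app.logger.debug(result.stderr.decode(errors="replace"))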

            async def subprocess_call():
                # run the blocking function in another thread,
                # and wait for its result:
                app.logger.debug("sub function execution starts")
                await loop.run_in_executor(_executor, model_run)

            asyncio.set_event_loop(asyncio.SelectorEventLoop())
            loop = asyncio.get_event_loop()
            loop.run_until_complete(subprocess_call())
            loop.close()
    return jsonify("Model created successfully for sent file %s" % filenames)


if __name__ == "__main__":
    app.run()
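
For comparison, the event-loop wrapper in model_creation() can be reduced to the sketch below. It uses a stand-in blocking_job() (a hypothetical function that only sleeps) instead of the real script. It suggests that run_until_complete() around run_in_executor() blocks the calling thread just as a direct call does, so the wrapper by itself does not change how long the request waits.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_job():
    """Stand-in for the long-running subprocess.run() call (hypothetical)."""
    time.sleep(0.2)
    return "done"


def run_directly():
    # Plain call: the calling thread blocks until the job returns.
    return blocking_job()


def run_via_event_loop():
    # Same job pushed through an executor and an event loop.
    # run_until_complete() still blocks the calling thread until it finishes.
    executor = ThreadPoolExecutor(1)
    loop = asyncio.new_event_loop()
    try:
        future = loop.run_in_executor(executor, blocking_job)
        return loop.run_until_complete(future)
    finally:
        loop.close()
        executor.shutdown()


start = time.monotonic()
direct_result = run_directly()
direct_elapsed = time.monotonic() - start

start = time.monotonic()
loop_result = run_via_event_loop()
loop_elapsed = time.monotonic() - start
```

Both variants return the same result after roughly the same delay; the difference is only in which thread executes the job.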


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow