Split work between writing output from .exe and reading the output

I have a simple .exe downloaded from the Apple App Store. It gives real-time updates on crypto prices and their percentage changes. I am extracting the percentage changes for Bitcoin.

I am using a subprocess to capture the output. I store the output in four separate text files, then read the text from each file, extract the data I need, and save it into a pandas DataFrame.

Additionally, each .exe run has a 60-second timeout, and I read the file after 70 seconds. After reading, I truncate the file by removing its contents; by the time the next file has output in it, I read that file too, then truncate and repeat.

I want to know how to split the work between saving the output to a text file and reading the content, then truncating it. For example, I am running one process that should execute the .exe (append_output_executable) and another that should extract the output with write_truncate_output. However, only append_output_executable ever runs.

Here's my script:

from subprocess import STDOUT, check_call as x
import os
from multiprocessing import Pool
import time
import re
from collections import defaultdict
import pandas as pd
from multiprocessing import Process

cmd = [r'/Applications/CryptoManiac.app/Contents/MacOS/CryptoManiac']
text_file = ['bitcoin1.txt','bitcoin2.txt','bitcoin3.txt','bitcoin4.txt']

def append_output_executable(cmd):
    while True:
        i = '1234'
        for num in i:
            try: #append the .exe output to multiple files 
                with open(os.devnull, 'rb') as DEVNULL, open('bitcoin{}.txt'.format(num), 'ab') as f:
                    x(cmd,  stdout=f, stderr=STDOUT, timeout=60)
                
            except:
                pass  


def write_truncate_output(text):
    while True:
        time.sleep(70)
        with open(text, 'r+') as f:
            data = f.read()
            f.truncate(0)
            #read and truncate after reading the data
    
            #filter and format
        percentage=re.findall(r'\bpercent_change_24h:\s.*', data)
        value= [x.split(':')[1] for x in percentage]
        key = [x.split(':')[0] for x in percentage]

        #store in dictionary
        percent_dict = defaultdict(list)

        for ke, val in zip(key, value):
            percent_dict[ke].append(val)
            percent_dict['file'].append(text)

        percent_frame = pd.DataFrame(percent_dict)
    
        print(percent_frame)

if __name__ == '__main__':
    for text in text_file:
        execute_process = Process(target = append_output_executable, args=(cmd,))
        output_process = Process(target = write_truncate_output, args=(text,))
        execute_process.start()
        execute_process.join()
        output_process.start()
        output_process.join()
    

This still just runs the first function.



Solution 1:

I don't know if this answer resolves all your problems, because the program has several mistakes: when you repair one, it still doesn't work because of the others.


First:

target in Thread and Process needs the function's name without () and without arguments (this is called a callback); later, when you use .start(), it adds the () to run this function inside the new Thread or Process:

Thread(target=append_output_executable, args=(cmd,))

Process(target=append_output_executable, args=(cmd,))
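A minimal sketch of the difference, with a hypothetical work function:

```python
from threading import Thread

results = []

def work(msg):
    results.append(msg)

# correct: pass the function itself plus its arguments;
# .start() is what finally calls work('hello') in the new thread
t = Thread(target=work, args=('hello',))
t.start()
t.join()

# wrong: target=work('hello') would call work immediately in the
# main thread and hand Thread its return value (None) as the target
```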

Second:

Thread and Process run the function only once, so it needs a while loop to keep working all the time. It also can't use return, because that ends the function (and the thread/process with it).
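For example, a thread whose target has no loop does its work exactly once and then finishes (hypothetical task):

```python
from threading import Thread

calls = []

def task():
    calls.append('ran')   # without a loop, this runs exactly once

t = Thread(target=task)
t.start()
t.join()
# the thread is finished now; to keep working, task() needs a loop
# inside it, because a Thread object cannot be started twice
```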


Third:

.join() blocks the code, because it waits for the Thread or Process to end. It should be used only after starting all threads/processes; usually it goes at the end of the program, when you want to wait for all threads/processes to finish.
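A small sketch of why the order matters, with hypothetical timings: joining immediately after each .start() would run the workers one at a time, while starting them all first lets them overlap.

```python
import time
from threading import Thread

def work(n, done):
    time.sleep(0.1)       # stand-in for real work
    done.append(n)

done = []
start = time.time()

# start ALL threads first, so they run concurrently ...
threads = [Thread(target=work, args=(n, done)) for n in range(4)]
for t in threads:
    t.start()

# ... and only then join them; join() after each start() would
# serialize the work (~0.4s here instead of ~0.1s)
for t in threads:
    t.join()

elapsed = time.time() - start
```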


And a small suggestion:

You could use a global running = True and, inside the functions, while running; later you can set running = False to stop the loops and let the functions finish.
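A tiny runnable sketch of that flag pattern, with a hypothetical worker and shortened timings:

```python
import time
from threading import Thread

running = True
ticks = []

def worker():
    # loop for as long as the global flag is set; no `return` needed
    while running:
        ticks.append(time.time())
        time.sleep(0.01)

t = Thread(target=worker)
t.start()

time.sleep(0.1)   # the main program does its own work here
running = False   # tell the worker to leave its loop
t.join()          # now join() returns almost immediately
```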


The code could look like this:

# ... other imports ...
from threading import Thread

def append_output_executable(cmd):
    while running:
        # ... code ... (without `return`)

def write_truncate_output(text):
    while running:
        # ... code ... (without `return`)
         
# --- main ---

# global variables

running = True

if __name__ == '__main__':
    
    # --- create and start ---
    
    t0 = Thread(target=append_output_executable, args=(cmd,))
    t0.start()
        
    other_threads = []
   
    for text in text_file:
        t = Thread(target=write_truncate_output, args=(text,))
        t.start()
        other_threads.append(t)
        
    # ... other code ...
    
    # --- at the end of program ---
    
    running = False
    
    # --- wait for end of functions ---
    
    t0.join()
    for t in other_threads:
        t.join()

Exactly the same works with Process.

(I keep the same variable names to show that everything is the same.)

EDIT

Processes don't share memory, so setting running = False in the main process never reaches the children: each child process keeps its own copy of the flag and its loop never stops. A shared object is needed to send the "stop" signal to the processes, for example a Queue, or more simply a multiprocessing.Event passed to each worker as an argument. So this version needs changes:

# ... other imports ...
from multiprocessing import Process, Event

def append_output_executable(cmd, running):
    while running.is_set():
        # ... code ... (without `return`)

def write_truncate_output(text, running):
    while running.is_set():
        # ... code ... (without `return`)

# --- main ---

if __name__ == '__main__':

    # shared flag, visible to all processes
    running = Event()
    running.set()

    # --- create and start ---

    t0 = Process(target=append_output_executable, args=(cmd, running))
    t0.start()

    other_threads = []

    for text in text_file:
        t = Process(target=write_truncate_output, args=(text, running))
        t.start()
        other_threads.append(t)

    # ... other code ...

    # --- at the end of program ---

    running.clear()

    # --- wait for end of functions ---

    t0.join()
    for t in other_threads:
        t.join()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
