'parallelizing a Python function?

I have a function that submits a search job to a REST API, waits for the API to respond, then downloads 2 sets of JSON data, converts the both JSON's into Pandas dataframes, and returns both dataframes. below is a very simplified version of the function(minus error handling, logging, data scrubbing, etc...)

def getdata(searchstring, url, uname, passwd):
  headers = {'content-type': 'application/json'}
  json_data = CreateJSONPayload(searchstring)

  rPOST = requests.post(url, auth=(uname, passwd), data=json_data, headers=headers)
  statusURL = (str(json.loads(rPOST.text)[u'link'][u'href']))

  Processing = True
  while Processing == True:
      rGET = requests.get(statusURL, auth=(uname, passwd))
      if rGET.status_code== 200:
          url1 = url + "/dataset1"
          url2 = url + "/dataset2"
          rGET1 = requests.get(url1, auth=(uname, passwd))
          rGET2 = requests.get(url2, auth=(uname, passwd))
          dfData1 = pd.read_json(rGET1.text)
          dfData2 = pd.read_json(rGET2.text)
          Processing = False
      elif StatusCode == "Other return code handling":
          print("handle errors") # Not relevant to question.
      else:
          sleep(15)

  return dfData1, dfData2

The function itself works as expected. However the API being called can take anywhere from a couple of minutes to an hour to return the data depending on the parameters I pass to it and I need to submit multiple searches to it, so I'd rather not submit each search one after the other.

What's the best way to parallelize the calling of a function like this so that I can submit multiple requests to it at the same time, wait for all calls of the function have returned data, and finally continue on with data processing in the script?

I also need to be able to throttle the requests too, as the API rate limits me to no more than 15 concurrent connections at a time.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source

'parallelizing a Python function?

Sources

Related Questions