concurrent.futures returns the same result every time
I am trying to load multiple pages at the same time with ThreadPoolExecutor.
The code is supposed to request examplesite.com&pageNo=x once for each page, but it currently returns the source code of the last link in the list, repeated once per link. I expected the results to contain the source code of every page in the pagelinks list.
Both of the following variants return the same result.
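# Variant 1
pool = ThreadPoolExecutor(5)
results = pool.map(load_page, pagelinks)
for result in results:
    print(result)
    pages.append(result)

# Variant 2 (executor is assumed to be another ThreadPoolExecutor; its creation is not shown)
for result in executor.map(load_page, pagelinks):
    print(result)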
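As a sanity check on the two variants, here is a minimal sketch showing that `pool.map` and `executor.map` are the same call and return results in input order when the worker function has no shared state. `fake_load_page` is a hypothetical stand-in for `load_page`, not part of the original code, so this only suggests the duplication comes from somewhere other than `map` itself.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_load_page(link):
    # Hypothetical stand-in for load_page: no shared state, no browser.
    return "source of " + link


pagelinks = ["example.com&pageNo=" + str(i + 1) for i in range(5)]

# Variant 1: explicit pool object, shut down by hand.
pool = ThreadPoolExecutor(5)
first = list(pool.map(fake_load_page, pagelinks))
pool.shutdown()

# Variant 2: context manager, which shuts the pool down on exit.
with ThreadPoolExecutor(max_workers=5) as executor:
    second = list(executor.map(fake_load_page, pagelinks))

print(first == second)
print(first[0], "|", first[-1])
```

With a function like this, every link yields its own result, which points at something inside `load_page` as the place to look.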
This is the longer version of the code:
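from concurrent.futures import ThreadPoolExecutor

from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# browser, url and pagecount are defined elsewhere (not shown)

def load_page(link):
    browser.get(link)
    # wait for necessary elements to load
    delay = 30
    myElem = WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.ID, "sort")))
    source_code = BeautifulSoup(browser.page_source, "html.parser")
    return source_code

pagelinks = []
pages = []
for i in range(pagecount):
    pagelinks.append(url + "&pageNo=" + str(i + 1))

pool = ThreadPoolExecutor(5)
results = pool.map(load_page, pagelinks)
for result in results:
    print(result)
    pages.append(result)

for page in pages:
    pass  # do stuff with page source code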
For example, if pagelinks has 5 links (example.com&pageNo=1 through example.com&pageNo=5), the source code of example.com&pageNo=5 gets appended to pages 5 times, instead of each link's source code being appended once.
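One likely explanation, assuming `browser` is a single WebDriver object shared by all worker threads, is that the `browser.get(link)` calls overlap: every thread navigates the same browser, and by the time each thread reads `browser.page_source` the browser is showing whichever page was requested last. The sketch below reproduces that pattern without Selenium. `FakeBrowser`, `load_page_shared`, and `load_page_per_thread` are hypothetical names for illustration only, and the barrier is there purely to make the overlap deterministic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

WORKERS = 5
pagelinks = ["example.com&pageNo=" + str(i + 1) for i in range(WORKERS)]


class FakeBrowser:
    """Hypothetical stand-in for a Selenium WebDriver (not a real API)."""

    def __init__(self):
        self.page_source = ""

    def get(self, link):
        self.page_source = "source of " + link


# Makes every worker finish get() before any worker reads page_source,
# so the overlap happens on every run of this demonstration.
barrier = threading.Barrier(WORKERS)

shared_browser = FakeBrowser()  # one browser for all threads, as assumed above


def load_page_shared(link):
    shared_browser.get(link)
    barrier.wait(timeout=10)
    return shared_browser.page_source  # whatever page was requested last


thread_state = threading.local()  # attributes set here are private to each thread


def load_page_per_thread(link):
    if not hasattr(thread_state, "browser"):
        thread_state.browser = FakeBrowser()  # created once per worker thread
    thread_state.browser.get(link)
    return thread_state.browser.page_source


with ThreadPoolExecutor(max_workers=WORKERS) as executor:
    shared_pages = list(executor.map(load_page_shared, pagelinks))
    per_thread_pages = list(executor.map(load_page_per_thread, pagelinks))

print(len(set(shared_pages)))      # distinct results with the shared browser
print(len(set(per_thread_pages)))  # distinct results with one browser per thread
```

If this is indeed the cause, the corresponding change in the real code would be to give each worker thread its own WebDriver instance (for example via `threading.local()`, as sketched here) and to quit those drivers once the work is done. That part is an assumption about the setup, since the original code does not show how `browser` is created.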
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
