Restart while loop when selector for next page is clicked
I am very new to coding and this is my first project: a simple eBay scraper built with Puppeteer. The program takes the item to search for and the number of pages to scrape as command-line arguments.
Because the index in the selector's li:nth-child increases with each item, I created an incrementing while loop to step through the nth children from 2 to 60.
This works for the first page and outputs all the desired items. However, since I need to scrape two or more pages, the i value used in the loop must reset at the start of each new page so that the loop restarts.
I believe the pagination itself is correct, since in headless mode the second, third, etc. pages are clicked through.
Current loop:
    let currentPage = 1;
    while (currentPage <= pagestoScrape) {
      let content = await page.evaluate(() => {
        let data = [];
        let childnumber = 2;
        let i = 2;
        while (i < 60) {
          let itemids = document.querySelectorAll(`#srp-river-results > ul > li:nth-child(${childnumber}) > div`);
          itemids.forEach((itemid) => {
            let PRODUCT_ID = `#srp-river-results > ul > li:nth-child(${childnumber}) > div > div.s-item__info.clearfix > a > h3`;
            let PRODUCT_PRICE = `#srp-river-results > ul > li:nth-child(${childnumber}) > div > div.s-item__info.clearfix > div.s-item__details.clearfix > div:nth-child(1) > span`;
            let item = itemid.querySelector(PRODUCT_ID).innerText;
            let price = itemid.querySelector(PRODUCT_PRICE).innerText;
            data.push({
              item,
              price,
            });
          });
          childnumber++;
          i++;
        }
        return data;
      });
      let urls = [];
      urls = urls.concat(content);
      if (currentPage < pagestoScrape) {
        await Promise.all([
          await page.click('#srp-river-results > ul > div.srp-river-answer.srp-river-answer--BASIC_PAGINATION_V2 > div.s-pagination > span > span > nav > a.pagination__next.icon-link'),
          await page.waitForSelector('#srp-river-results > ul > div.srp-river-answer.srp-river-answer--BASIC_PAGINATION_V2 > div.s-pagination > span > span > nav > a.pagination__next.icon-link'),
        ])
      }
      currentPage++;
      let jsonData = JSON.stringify(content, null, 1);
      fs.writeFileSync(`./output/output${item}.txt`, jsonData);
    }
I have tried re-evaluating i after the loop and during the promise, and changing the loop to a for loop. I expected this to reset the loop and for the output to contain the results of every page. However, the output contains the results of just one page, with no error.
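One possible reading of the code, offered as a sketch rather than a confirmed diagnosis: i and childnumber are declared inside the page.evaluate callback, so they already start again at 2 on every page. What seems more likely to produce single-page output is that urls is re-declared inside the outer loop, and fs.writeFileSync rewrites the same file on every pass with only the current page's content. The inner awaits inside Promise.all may also let the next evaluate run before the new page has loaded. The sketch below keeps one accumulator outside the loop and returns it so it can be written once. The names scrapeAllPages, extractItems and NEXT_SELECTOR and the shortened selectors are hypothetical and not taken from the original code, and the sketch assumes that clicking the next-page link triggers a full page navigation.

```javascript
// Hypothetical, shortened selector for the "next page" link (an assumption).
const NEXT_SELECTOR = 'a.pagination__next';

// Runs in the browser context via page.evaluate.
// Selects every result <li> at once instead of counting nth-child indexes.
function extractItems() {
  const data = [];
  document.querySelectorAll('#srp-river-results > ul > li').forEach((li) => {
    const title = li.querySelector('.s-item__info a h3');
    const price = li.querySelector('.s-item__details > div:nth-child(1) > span');
    // Skip list entries that are not products (missing title or price).
    if (title && price) {
      data.push({ item: title.innerText, price: price.innerText });
    }
  });
  return data;
}

// Collects the items from every page into one array.
async function scrapeAllPages(page, pagesToScrape) {
  let results = []; // declared once, outside the loop, so pages accumulate
  for (let currentPage = 1; currentPage <= pagesToScrape; currentPage++) {
    const content = await page.evaluate(extractItems);
    results = results.concat(content);
    if (currentPage < pagesToScrape) {
      // Start waiting for the navigation before clicking, with no inner awaits,
      // so both promises are pending together.
      await Promise.all([
        page.waitForNavigation(),
        page.click(NEXT_SELECTOR),
      ]);
    }
  }
  return results;
}
```

With a real Puppeteer page, the returned array could then be passed to JSON.stringify and written with fs.writeFileSync once, after the loop has finished, instead of on every pass.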
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
