boiling-coffee, 2y ago

How to push to the request queue after a click?

I tried to push links into the request queue inside a loop. However, when I inspected the queue, its count either did not change or was lower than expected. I think the same link URLs are entering the queue again while the crawl is running, so they are treated as duplicates. Is there a solution?
for (let i = 0; i < 5; i++) {
    await enqueueLinks({
        selector: "li.sa_item>div>div>div.sa_text>a",
        label: "DETAIL",
        strategy: EnqueueStrategy.All,
    });
    await page.waitForSelector('div.section_more');
    await page.click('div.section_more');
    await page.waitForLoadState("domcontentloaded");
    const queueInfo = await crawler.requestQueue?.getInfo();
    console.log(queueInfo);
}
2 Replies
Alexey Udovydchenko
Use crawler.addRequests and give each request a random uniqueKey (https://crawlee.dev/api/next/core/class/Request#uniqueKey) so that the same URLs will be crawled again.
Request | API | Crawlee
Represents a URL to be crawled, optionally including HTTP method, headers, payload and other metadata. The Request object also stores information about errors that occurred during processing of the request. Each Request instance has the uniqueKey property, which can be either specified manually in the constructor or generated automaticall...
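A minimal sketch of the suggestion above, assuming the detail links have already been collected into an array of URL strings. The helper name buildDetailRequests and the key format are hypothetical, not part of Crawlee. The sketch only builds plain request objects with url, label, and uniqueKey; the commented lines show how they might be passed to crawler.addRequests inside the handler.

```typescript
// Plain request object with the fields used in this thread.
interface DetailRequest {
    url: string;
    label: string;
    uniqueKey: string;
}

// Hypothetical helper: gives every URL its own random uniqueKey so the
// request queue does not treat a repeated URL as a duplicate.
function buildDetailRequests(urls: string[], label = "DETAIL"): DetailRequest[] {
    return urls.map((url, index) => ({
        url,
        label,
        // Timestamp + index + random suffix; the format is an arbitrary choice.
        uniqueKey: `${url}#${Date.now()}-${index}-${Math.random().toString(36).slice(2)}`,
    }));
}

// Possible usage inside the request handler (assumes `page` and `crawler`
// from the question are in scope):
//
// const urls = await page.$$eval(
//     "li.sa_item>div>div>div.sa_text>a",
//     (els) => els.map((a) => (a as HTMLAnchorElement).href),
// );
// await crawler.addRequests(buildDetailRequests(urls));
```

Note that random keys switch off deduplication entirely: a link that is still on the page after the next click would be enqueued, and crawled, once per loop iteration. If each article should be crawled only once, leaving uniqueKey unset is likely the better choice, and a queue count that does not grow would then simply mean the links were already queued.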
boiling-coffee (OP), 2y ago
Thank you so much, I solved it.
