provincial-silver•3y ago
PlaywrightCrawler New Instance unexpected result
Hi guys, I'm new to Crawlee.
I wrapped the sample code in a function.
Each time the getAvailableURLs function is called, a new instance of the PlaywrightCrawler class is created and used to crawl the provided URL.
Source Code:
Result:
1st Crawl: INFO PlaywrightCrawler: Terminal status message: Finished! Total 3 requests: 3 succeeded, 0 failed.
2nd Crawl: INFO PlaywrightCrawler: Terminal status message: Finished! Total 0 requests: 0 succeeded, 0 failed.
Expected Result:
1st Crawl: INFO PlaywrightCrawler: Terminal status message: Finished! Total 3 requests: 3 succeeded, 0 failed.
2nd Crawl: INFO PlaywrightCrawler: Terminal status message: Finished! Total 3 requests: 3 succeeded, 0 failed.
Question
How do I get the expected result while still being able to customize strategy, maxRequestsPerCrawl and maxRequestRetries by passing them in as parameters?
quickest-silver•3y ago
I'd say the problem is that both crawlers use the same default request queue. The first call processes the requests, so by the time the second call runs, the queue already holds those requests marked as handled, and the crawler shuts down right away.
You could open the queue before creating the crawler instance and drop it explicitly after the crawler run. In that case the second call should go through.
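A minimal sketch of that idea, since the original source code is not shown in the thread. The `CrawlOptions` type, the `resolveOptions` helper, the default values and the body of the request handler are all assumptions made for illustration. Giving each queue a unique name is also my own choice; calling `RequestQueue.open()` with no name would open the default queue instead.

```typescript
import { randomUUID } from 'node:crypto';
import { PlaywrightCrawler, RequestQueue } from 'crawlee';

// Hypothetical options type; the fields mirror the settings named in the question.
interface CrawlOptions {
    strategy?: 'all' | 'same-hostname' | 'same-domain';
    maxRequestsPerCrawl?: number;
    maxRequestRetries?: number;
}

// Fill in defaults for anything the caller did not pass (default values are assumptions).
function resolveOptions(options: CrawlOptions = {}): Required<CrawlOptions> {
    return {
        strategy: options.strategy ?? 'same-hostname',
        maxRequestsPerCrawl: options.maxRequestsPerCrawl ?? 10,
        maxRequestRetries: options.maxRequestRetries ?? 3,
    };
}

async function getAvailableURLs(url: string, options: CrawlOptions = {}): Promise<string[]> {
    const { strategy, maxRequestsPerCrawl, maxRequestRetries } = resolveOptions(options);
    const found: string[] = [];

    // Open a uniquely named queue so this call never sees requests
    // that an earlier call already marked as handled.
    const requestQueue = await RequestQueue.open(`crawl-${randomUUID()}`);

    const crawler = new PlaywrightCrawler({
        requestQueue,
        maxRequestsPerCrawl,
        maxRequestRetries,
        async requestHandler({ request, enqueueLinks }) {
            // Record the visited URL, then follow links according to the chosen strategy.
            found.push(request.loadedUrl ?? request.url);
            await enqueueLinks({ strategy });
        },
    });

    try {
        await crawler.run([url]);
    } finally {
        // Drop the queue explicitly so nothing is left behind for the next call.
        await requestQueue.drop();
    }
    return found;
}
```

Calling `await getAvailableURLs(url, { maxRequestsPerCrawl: 3 })` twice in a row should then start each crawl from an empty queue. I have not run this against the code from the question, so treat it as a starting point.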