Requests timing out - best practices?
I have a huge number of URLs to use as starting points for my scrape, and I am initiating the scrape with something like this (note: startUrls is an array containing several hundred URLs):
Then, in the router.addDefaultHandler callback, which runs for each page, I scroll further down the page, enqueuing more links as I go. So what I'm trying to do is quite extensive, and I expect the scrape to take many hours.
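To make the scroll-and-enqueue pattern concrete, here is a simplified sketch rather than the exact handler from the scraper. It assumes a Playwright-based crawler whose handler context exposes `page` and `enqueueLinks`. The names `PageLike`, `HandlerContext`, and `scrollAndEnqueue`, as well as the step size and pause values, are hypothetical and chosen only for illustration.

```typescript
// Minimal structural types standing in for the crawler's handler context.
// Assumption: the real context provides a Playwright page and an enqueueLinks helper.
interface PageLike {
  mouse: { wheel(deltaX: number, deltaY: number): Promise<void> };
  waitForTimeout(ms: number): Promise<void>;
}

interface HandlerContext {
  page: PageLike;
  enqueueLinks(): Promise<unknown>;
}

// Hypothetical helper: enqueue the links visible on load, then scroll a fixed
// number of times, pausing and enqueuing again after each scroll step.
// Returns how many times enqueueLinks was called.
async function scrollAndEnqueue(
  ctx: HandlerContext,
  maxScrolls: number,
  stepPx = 2000, // illustrative scroll distance per step
  pauseMs = 500, // illustrative pause so lazy-loaded content can render
): Promise<number> {
  let enqueueCalls = 0;

  await ctx.enqueueLinks();
  enqueueCalls++;

  for (let i = 0; i < maxScrolls; i++) {
    await ctx.page.mouse.wheel(0, stepPx);
    await ctx.page.waitForTimeout(pauseMs);
    await ctx.enqueueLinks();
    enqueueCalls++;
  }

  return enqueueCalls;
}
```

In a real project this would be wired up with something like `router.addDefaultHandler(async (ctx) => { await scrollAndEnqueue(ctx, 10); })`. Capping the number of scroll steps keeps the time spent on any one page bounded, which matters when each page enqueues many more requests.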
When I run my scraper, it works well up to a point, but then I start getting more and more errors like:
And eventually, the entire thing grinds to a halt with something like:
