metropolitan-bronze•2y ago
Maximum urls to crawl from a named request queue
I have created a named request queue, like this:
const namedRequestQueue = await RequestQueue.open('source-name');
Now when the crawler runs, I want it to crawl only 100 URLs from this named request queue.
My objective is to create many named request queues, run them one by one, and crawl 100 URLs from each queue.
I cannot use maxRequestsPerCrawl because that will limit the total number of URLs crawled.
How can I do that? I want the overall crawl to stay unlimited, but for each request queue the crawler should process only the first 100 URLs from the website.
2 Replies
fair-rose•2y ago
i've had all of the same questions
it's not really possible out of the box
you'll have to manually control the crawler's queues
The first question would be: why do it like this?
You can do this in many ways.
1. run a separate crawler for each queue
2. dynamically switch the queue on the crawler object
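Option 1 could look roughly like the sketch below. It assumes that maxRequestsPerCrawl is counted per crawler instance, so a fresh crawler per queue starts its own count; that is worth verifying, especially if a named queue already holds handled requests from an earlier run. The helper crawlQueuesOneByOne and its openQueue / createCrawler parameters are hypothetical names made up for this example, not part of Crawlee.

```javascript
// Hypothetical helper: run one crawler per named queue, each with its own cap.
// `openQueue` and `createCrawler` are injected so the sketch is not tied to a
// specific crawler class. With Crawlee they could be wired up like this
// (assuming CheerioCrawler and your own requestHandler):
//   openQueue     = (name) => RequestQueue.open(name)
//   createCrawler = (options) => new CheerioCrawler({ ...options, requestHandler })
async function crawlQueuesOneByOne(queueNames, openQueue, createCrawler, limitPerQueue = 100) {
    const finished = [];
    for (const name of queueNames) {
        // Open (or create) the named queue for this source.
        const requestQueue = await openQueue(name);
        // Assumption: the limit is a per-crawler option, so a new crawler
        // for each queue gets a fresh budget of `limitPerQueue` requests.
        const crawler = createCrawler({ requestQueue, maxRequestsPerCrawl: limitPerQueue });
        // Wait for this crawler to finish before moving on to the next queue.
        await crawler.run();
        finished.push(name);
    }
    return finished;
}
```

The queues are processed strictly one after another here; running the crawlers in parallel would be a different trade-off and is not covered by this sketch.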