optimistic-gold•4y ago
Parallel crawling
How do I do parallel crawling with the Puppeteer crawler?
6 Replies
You can use the "maxConcurrency" option in PuppeteerCrawlerOptions:
https://crawlee.dev/api/next/puppeteer-crawler/interface/PuppeteerCrawlerOptions#maxConcurrency
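For illustration, here is a minimal sketch of what setting that option might look like. The interface is a local stand-in for the concurrency fields of PuppeteerCrawlerOptions (declared here only so the snippet has no dependencies), the name concurrencyLimits is made up, and the numbers are placeholders rather than recommendations.

```typescript
// Local stand-in for the concurrency fields of Crawlee's PuppeteerCrawlerOptions.
// Assumption: in a real project you would rely on the types shipped with 'crawlee'.
interface ConcurrencyLimits {
    minConcurrency?: number;
    maxConcurrency?: number;
}

// Illustrative values only; tune them for your target site and hardware.
const concurrencyLimits: ConcurrencyLimits = {
    minConcurrency: 2,  // lower bound for parallel requests
    maxConcurrency: 20, // upper bound for parallel requests
};

console.log(`Parallel requests capped at ${concurrencyLimits.maxConcurrency}`);

// In a real project (with `crawlee` and `puppeteer` installed) the object
// would be spread into the crawler options, roughly like this:
//
//   import { PuppeteerCrawler } from 'crawlee';
//
//   const crawler = new PuppeteerCrawler({
//       ...concurrencyLimits,
//       requestHandler: async ({ request, page, log }) => {
//           log.info(`${request.url}: ${await page.title()}`);
//       },
//   });
//   await crawler.run(['https://example.com']);
```

The crawler still scales itself between the two bounds; the options only set the limits it is allowed to move within.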
https://crawlee.dev/docs/next/guides/scaling-crawlers#minconcurrency-and-maxconcurrency
optimistic-goldOP•4y ago
thank you
evident-indigo•4y ago
I recommend using desiredConcurrency to boost your starting concurrency: https://crawlee.dev/docs/guides/scaling-crawlers#desiredconcurrency
Based on my experience, Crawlee is quite efficient at figuring out how far it can scale without any configuration.
evident-indigo•4y ago
@Casper That is true, but the initial concurrency is quite low, which is why it’s good to set desiredConcurrency. It improves performance a lot, especially for short crawls with a lot of requests.
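As a rough sketch of how that could be wired up: in the scaling guide, desiredConcurrency sits under autoscaledPoolOptions rather than at the top level of the crawler options. The helper name startingConcurrency and the clamping rule are my own assumptions for illustration, not part of Crawlee.

```typescript
// Shape of the crawler options this sketch produces. Declared locally so the
// snippet runs without dependencies; the field names follow the Crawlee guide.
interface StartingConcurrency {
    maxConcurrency: number;
    autoscaledPoolOptions: { desiredConcurrency: number };
}

// Hypothetical helper (not a Crawlee API): keeps the starting concurrency
// within [1, max] so the crawler never starts above its own upper bound.
function startingConcurrency(desired: number, max: number): StartingConcurrency {
    if (!Number.isInteger(desired) || !Number.isInteger(max) || max < 1) {
        throw new RangeError('desired and max must be integers, and max must be at least 1');
    }
    const clamped = Math.min(Math.max(desired, 1), max);
    return {
        maxConcurrency: max,
        autoscaledPoolOptions: { desiredConcurrency: clamped },
    };
}

// Start with 10 parallel requests and allow scaling up to 20.
const startOptions = startingConcurrency(10, 20);
console.log(JSON.stringify(startOptions));

// In a real project the result would be spread into the crawler options:
//   new PuppeteerCrawler({ ...startOptions, requestHandler: async () => { /* ... */ } });
```

Starting higher mostly pays off on short crawls, where the run can finish before the autoscaler has had time to ramp up from its low default.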
I agree 👍