equal-aqua•2y ago

Trying to optimize autoscale options

Hello, I am running my scraper on AWS ECS with 8 vCPUs and 16 GB of memory, using these crawler options:
maxConcurrency: 200,
maxRequestsPerCrawl: 500,
maxRequestRetries: 2,
requestHandlerTimeoutSecs: 185,
Right now average CPU and memory usage are both around 88%. Is there anything I can do here to optimize further? I also have CRAWLEE_AVAILABLE_MEMORY_RATIO=0.8 set.
5 Replies
other-emerald•2y ago
Hi @bmax, this is a case-by-case thing. It depends heavily on the sites being scraped, whether you are using a browser, the browser settings, and so on.
equal-aqua (OP)•2y ago
@vojtechmaslan any guidelines?
equal-aqua (OP)•2y ago
Also, CPU seems to hit 99% no matter what:
{"time":"2024-04-11T04:57:31.174Z","level":"INFO","msg":"PuppeteerCrawler:AutoscaledPool: state","scraper":"web","currentConcurrency":18,"desiredConcurrency":17,"systemStatus":{"isSystemIdle":false,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":true,"limitRatio":0.6,"actualRatio":0.736},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
What does eventLoop overloaded mean? And how come currentConcurrency is 18 when I have maxConcurrency at 200 and there are plenty of requests?
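One way to read the state line above is to pull out which systemStatus entries report isOverloaded: true. The sketch below does only that. It uses the values from the log (trimmed to the relevant fields), and the helper name overloadedSignals is hypothetical, not part of Crawlee. In this log, eventLoopInfo is the only flagged entry (actualRatio 0.736 against a limitRatio of 0.6) and isSystemIdle is false. As far as I understand, the pool does not scale up while the system is not idle, which would explain concurrency staying near 18 instead of approaching maxConcurrency.

```typescript
// Hypothetical helper (not a Crawlee API): lists the systemStatus entries
// that report isOverloaded === true in an AutoscaledPool state log line.
function overloadedSignals(logLine: string): string[] {
  const state = JSON.parse(logLine) as {
    systemStatus: Record<string, unknown>;
  };
  return Object.entries(state.systemStatus)
    .filter(
      ([, value]) =>
        typeof value === "object" &&
        value !== null &&
        (value as { isOverloaded?: boolean }).isOverloaded === true,
    )
    .map(([name]) => name);
}

// Values copied from the log line in this thread, trimmed to the relevant fields.
const line =
  '{"currentConcurrency":18,"desiredConcurrency":17,"systemStatus":{' +
  '"isSystemIdle":false,' +
  '"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},' +
  '"eventLoopInfo":{"isOverloaded":true,"limitRatio":0.6,"actualRatio":0.736},' +
  '"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},' +
  '"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}';

// Only eventLoopInfo is flagged for this log line.
console.log(overloadedSignals(line));
```

Running the same check over several consecutive state lines would show whether the event loop is the bottleneck consistently or only in bursts.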
