Concurrency Settings vs Autoscaling Pool
I am puzzled by the gap between the concurrency I configure and what the autoscaled pool reports.
I am deploying the crawler on a beefy EC2 instance with the following settings:
concurrency_settings = ConcurrencySettings(
    min_concurrency=10,
    max_concurrency=100,
)
But my autoscaling pool tells me:
[crawlee._autoscaling.autoscaled_pool] INFO current_concurrency = 0; desired_concurrency = 10; cpu = 0.0; mem = 0.0; event_loop = 0.212; client_info = 0.0
I am using PlaywrightCrawler with a curl-impersonate HTTP client:
return PlaywrightCrawler(
    request_handler=router,
    request_handler_timeout=timeout,
    max_request_retries=config.max_retries,
    concurrency_settings=concurrency_settings,
    http_client=http_client,
)
Are there any hints on optimizing for concurrency?
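To make it easier to compare the configured bounds with what the pool reports over time, the status line can be parsed with a small script. This is only a sketch: `parse_pool_status` and `at_lower_bound` are hypothetical helpers, not part of Crawlee, and the log format is assumed to match the line quoted above.

```python
import re

# Status line copied from the log output quoted in this post.
LOG_LINE = (
    "[crawlee._autoscaling.autoscaled_pool] INFO current_concurrency = 0; "
    "desired_concurrency = 10; cpu = 0.0; mem = 0.0; "
    "event_loop = 0.212; client_info = 0.0"
)


def parse_pool_status(line: str) -> dict[str, float]:
    """Hypothetical helper: extract every 'name = number' pair from a status line."""
    pairs = re.findall(r"(\w+) = (\d+(?:\.\d+)?)", line)
    return {name: float(value) for name, value in pairs}


def at_lower_bound(status: dict[str, float], min_concurrency: int) -> bool:
    """Hypothetical helper: True if the pool's desired concurrency equals the configured minimum."""
    return status["desired_concurrency"] == min_concurrency


status = parse_pool_status(LOG_LINE)
print(status)
print("desired concurrency at configured minimum:", at_lower_bound(status, 10))
```

For the line above, the desired concurrency equals the configured `min_concurrency` of 10 and the current concurrency is 0. Running the same check on later log lines would show whether the pool ever moves above the minimum.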