extended-salmon
extended-salmon•16mo ago

PlaywrightCrawler actor not finishing requestQueue

I have a Playwright Actor that has 10 URLs added to its queue before I kick it off with .run(), but the Actor doesn't finish all 10 URLs. It will process between 4 and 7, then the log for the run just shows the statistics message repeated every second. Note that this happens in my local runs of this Actor as well. The total number of URLs scraped (out of 10) varies from run to run, with a minimum of 1 URL and a maximum of 7. This is the message it shows on repeat, both locally and on the Apify platform:
2024-05-22T22:34:24.274Z INFO Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":35781,"requestsFinishedPerMinute":2,"requestsFailedPerMinute":0,"requestTotalDurationMillis":143124,"requestsTotal":4,"crawlerRuntimeMillis":120866,"retryHistogram":[4]}
2024-05-22T22:34:24.301Z INFO PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":6,"desiredConcurrency":11,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
Why would it stop pulling from the requestQueue? There are no errors in the Actor prior to this.
8 Replies
ondro_k
ondro_k•16mo ago
Hi, could you share ID (or URL) of your run?
extended-salmon
extended-salmonOP•16mo ago
Apify
Apify Console
Manage the Apify platform and your account.
inland-turquoise
inland-turquoise•16mo ago
Hi, did you find out what was wrong? I am having the same issue while using a Playwright browser in Crawlee.
HonzaS
HonzaS•16mo ago
I have had a similar problem with a Playwright crawler: it finished while there were still pending requests in the queue. But now this happened: pending requests = -5. How can this happen?
extended-salmon
extended-salmonOP•16mo ago
I never figured out why this was happening. I wound up starting my project from scratch again from the base Playwright Actor provided by Apify, and I haven’t had this problem again.
broad-brown
broad-brown•16mo ago
These two issues seem unrelated. In the run from kenny, all requests get fetched from the queue, but then the Actor stalls while handling them. To check your issue @HonzaS, we would need more info.
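A rough way to check that reading against the log lines quoted in the question is sketched below. It assumes that `requestsTotal` counts requests whose handling has ended (finished or failed) and that `currentConcurrency` counts requests being handled right now; both interpretations are assumptions about the log fields and are not confirmed in this thread. The helper `parseTrailingJson` is a hypothetical name, and the log payloads are trimmed to the fields used.

```typescript
// Trimmed copies of the two log lines from the question (only the fields used here).
const statsLine =
  'INFO Statistics: PlaywrightCrawler request statistics: {"requestsTotal":4,"requestsFailedPerMinute":0,"retryHistogram":[4]}';
const poolLine =
  'INFO PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":6,"desiredConcurrency":11}';

// Hypothetical helper: parse the JSON object that trails a log line.
function parseTrailingJson(line: string): Record<string, unknown> {
  const start = line.indexOf("{");
  if (start === -1) {
    throw new Error("no JSON payload in log line");
  }
  return JSON.parse(line.slice(start));
}

const queuedUrls = 10; // number of URLs the question says were enqueued

// Assumption: requestsTotal = requests whose handling has ended.
const handled = Number(parseTrailingJson(statsLine).requestsTotal);
// Assumption: currentConcurrency = requests currently inside a handler.
const inFlight = Number(parseTrailingJson(poolLine).currentConcurrency);

// Whatever is neither handled nor in flight would still be waiting in the queue.
const stillInQueue = queuedUrls - handled - inFlight;

console.log({ handled, inFlight, stillInQueue });
```

Under those assumptions, 4 handled plus 6 in flight accounts for all 10 URLs, which fits the picture of an empty queue with handlers that never return rather than a queue that is not being read.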
HonzaS
HonzaS•16mo ago
@vojtechmaslan I have shared the run with you via private message. Thanks.
