Apify Discord Mirror

Updated 2 weeks ago

clean way to stop "request queue seems to be stuck for 300.0"

At a glance

The community member is developing a scraper for a SPA with infinite scrolling. After 300 seconds the scraper logs a WARN, which spawns another Playwright instance. This likely happens because only one request is handled and nothing is added to the RequestQueue. The community member asks for a clean way to stop this from happening.

In the comments, another community member suggests increasing the timeout to allow more time for infinite scrolling. The community member who asked the original question thanks them and shares how to increase it: Actor.config.internal_timeout = timedelta(seconds=xxx).

The accepted answer is to increase the timeout to allow more time for infinite scrolling.

A scraper that I am developing scrapes a SPA with infinite scrolling. This works fine, but after 300 seconds I get a WARN, which spawns another Playwright instance.
This probably happens because I only handle one request (I do not add anything to the RequestQueue); the handler just runs a while loop until a finished condition is met.

Plain Text
[crawlee.storages._request_queue] WARN  The request queue seems to be stuck for 300.0s, resetting internal state. ({"queue_head_ids_pending": 0, "in_progress": ["tEyKIytjmqjtRvA"]})
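The "while until finished" pattern described above can be sketched in plain Python. Here `get_item_count` and `scroll_once` are hypothetical callables standing in for the real Playwright page interactions (e.g. counting rendered items and scrolling to the bottom); if one full pass through this loop takes longer than 300 seconds, the single in-progress request trips the warning above:

```python
import asyncio

async def scroll_until_done(get_item_count, scroll_once,
                            max_rounds=100, settle_delay=1.0):
    """Scroll repeatedly until no new items appear (or max_rounds is hit)."""
    previous = await get_item_count()
    for _ in range(max_rounds):
        await scroll_once()
        await asyncio.sleep(settle_delay)  # give the SPA time to load the next batch
        current = await get_item_count()
        if current == previous:
            return current  # no new content loaded: the "finished" condition is met
        previous = current
    return previous
```

The `max_rounds` cap is a safety net so a page that keeps loading forever cannot hang the handler indefinitely.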


What is a clean way to stop this from happening?
Marked as solution
Hi, since the default timeout is 300 seconds (5 minutes), you can increase the timeout to allow more time for infinite scrolling.
3 comments
Thank you !

For others:
Plain Text
from datetime import timedelta

Actor.config.internal_timeout = timedelta(seconds=xxx)
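For context, a minimal sketch of where that line would sit inside an Actor's entry point. This assumes the Apify Python SDK, where Actor.config exposes internal_timeout as a datetime.timedelta; it is a sketch of the configuration step, not a complete crawler:

```python
from datetime import timedelta

from apify import Actor  # assumes the apify Python SDK is installed

async def main() -> None:
    async with Actor:
        # Raise the internal timeout (default 300 s / 5 min) so a long
        # infinite-scroll pass is not mistaken for a stuck request queue.
        Actor.config.internal_timeout = timedelta(seconds=900)
        # ... set up and run the Playwright crawler here ...
```

Set the value comfortably above the longest scroll pass you expect, so the warning only fires when something is genuinely stuck.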