Casper, 4y ago

Resume a crawler from the previous run's request queue, locally and on Apify

Is it possible to stop a crawler and resume it from the previous run's request queue? I have a crawler that has been running locally for a couple of hours. I am getting throttled because I only use one IP, so I would like to add proxies to speed it up, but without starting from scratch, since re-crawling everything would be a waste of time. I want to reuse my existing request queue. Is this possible? And is it also possible on Apify?
4 Replies
xenial-black, 4y ago
Use a named request queue instead of an unnamed one. Named queues are persisted. The default request queue is unnamed and is tied to the actor's run.
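To make the named-queue suggestion concrete, here is a minimal sketch, assuming Crawlee 3 and a CheerioCrawler (the thread does not say which crawler class is used). The queue name, start URL, and proxy URL are hypothetical placeholders. Requests already marked as handled in the named queue should be skipped on the next run, because the queue deduplicates by unique key.

```typescript
import { CheerioCrawler, ProxyConfiguration, RequestQueue } from 'crawlee';

// Hypothetical values for illustration; none of them come from the thread.
export const QUEUE_NAME = 'my-resumable-queue';
export const START_URLS = ['https://example.com'];
export const PROXY_URLS = ['http://user:pass@proxy.example.com:8000'];

export async function resumeCrawl(): Promise<void> {
    // Opening a queue by name creates it on first use and reopens it later.
    // Named storages are not removed by the default purge on start.
    const requestQueue = await RequestQueue.open(QUEUE_NAME);

    // Rotate over your own proxy URLs to avoid single-IP throttling.
    const proxyConfiguration = new ProxyConfiguration({ proxyUrls: PROXY_URLS });

    const crawler = new CheerioCrawler({
        requestQueue,
        proxyConfiguration,
        async requestHandler({ request, enqueueLinks, log }) {
            log.info(`Processing ${request.url}`);
            // New links go into the same named queue.
            await enqueueLinks();
        },
    });

    // Re-adding the start URLs is harmless: duplicates are ignored by the queue.
    await crawler.run(START_URLS);
}
```

Call resumeCrawl() from your entry point. Note that this only helps runs that used the named queue from the start; a run that already filled the default unnamed queue is not moved into it.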
Casper (OP), 4y ago
Thanks
Lukas Krivka, 4y ago
You can also cancel the process and then start it again without purging the storage: crawlee run --no-purge. We are also working out graceful abort: https://github.com/apify/crawlee/issues/1531
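If you start the script directly instead of through the CLI, the same no-purge behaviour can probably be had from code. This is a sketch under the assumption that Crawlee 3's Configuration option purgeOnStart is what controls the purge (as far as I know it mirrors the CRAWLEE_PURGE_ON_START environment variable); the start URL is a hypothetical placeholder.

```typescript
import { CheerioCrawler, Configuration } from 'crawlee';

// Hypothetical start URL for illustration.
export const START_URLS = ['https://example.com'];

export async function resumeWithoutPurge(): Promise<void> {
    // Assumption: disabling purgeOnStart on the global config keeps the
    // default (unnamed) request queue from the previous local run.
    // This must happen before the crawler touches any storage.
    Configuration.getGlobalConfig().set('purgeOnStart', false);

    const crawler = new CheerioCrawler({
        async requestHandler({ request, enqueueLinks, log }) {
            log.info(`Processing ${request.url}`);
            await enqueueLinks();
        },
    });

    // Pending requests left in the default queue are picked up again;
    // the start URLs are deduplicated if they were already handled.
    await crawler.run(START_URLS);
}
```

Unlike the named-queue approach, this reuses the queue the interrupted run already filled, so it fits the case where a local crawl has been going for hours on the default queue.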
Casper (OP), 4y ago
Thanks
