inland-turquoise•12mo ago
Prevent automatic reclaim of failed requests
Hi everyone! Hope you're all doing well. I have a small question about Crawlee.
My use case is a little simpler than a crawler; I just want to scrape a single URL every few seconds.
To do this, I create a RequestList with just one URL and start the crawler. Sometimes the crawler returns HTTP errors and fails. I don't mind, since I'm going to run the crawler again after a few seconds, and I'd prefer the failed requests to be ignored rather than automatically reclaimed.
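Roughly what I'm doing now (a simplified sketch; the crawler class, URL and interval are just examples):

```ts
import { CheerioCrawler, RequestList } from 'crawlee';

async function scrapeOnce() {
    // A RequestList with a single URL (null = don't persist the list state)
    const requestList = await RequestList.open(null, ['https://example.com/status']);

    const crawler = new CheerioCrawler({
        requestList,
        requestHandler: async ({ $ }) => {
            // extract whatever I need from the single page
        },
    });

    await crawler.run();
}

// Re-run the whole thing every few seconds
setInterval(() => {
    scrapeOnce().catch(() => { /* a failed run is fine, just wait for the next one */ });
}, 5000);
```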
Is there a way of doing this?
4 Replies
quickest-silver•12mo ago
You can simply set the maxRequestRetries option to 0:
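A rough sketch (assuming a CheerioCrawler; the option works the same on the other crawler classes):

```ts
import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler({
    // 0 = a failed request is marked as failed immediately, never retried
    maxRequestRetries: 0,
    requestHandler: async ({ $ }) => {
        // process the single page
    },
});

await crawler.run(['https://example.com/status']);
```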
inland-turquoiseOP•12mo ago
Maybe I misunderstood how the lib works, but wouldn't that just make the request go to failed status faster?
Correct me if I'm wrong, but what I understood is:
- The URL is added to the request list;
- If the request fails, it is retried up to maxRequestRetries times;
- If it still fails, it is marked as failed and can be reclaimed.
I guess you can use the noRetry option:
https://crawlee.dev/api/next/core/class/Request#noRetry
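Something like this, I think (a minimal sketch; the crawler class and URL are just examples):

```ts
import { CheerioCrawler, Request } from 'crawlee';

const crawler = new CheerioCrawler({
    requestHandler: async ({ $ }) => {
        // process the single page
    },
});

// Mark the single request so it is never retried/reclaimed if it fails
const request = new Request({ url: 'https://example.com/status' });
request.noRetry = true;

await crawler.run([request]);
```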