inland-turquoise · 12mo ago

Prevent automatic reclaim of failed requests

Hi everyone! Hope you're all doing well. I have a small question about Crawlee. My use case is a little simpler than a full crawl; I just want to scrape a single URL every few seconds. To do this, I create a RequestList with just one URL and start the crawler. Sometimes the crawler gets HTTP errors and the request fails. That's fine, since I'm going to run the crawler again a few seconds later, so I'd prefer the failed request to be ignored rather than automatically reclaimed. Is there a way of doing this?
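Roughly, my setup looks like this (the URL is just a placeholder):
import { BasicCrawler, RequestList } from 'crawlee';

// single-URL list; this whole run is repeated every few seconds
const requestList = await RequestList.open('single-url', ['https://example.com']);

const crawler = new BasicCrawler({
    requestList,
    requestHandler: async ({ request }) => {
        // ...scrape the page here
    },
});

await crawler.run();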
4 Replies
Hall · 12mo ago
This post has been pushed to the community knowledgebase. Any replies in this thread will be synced to the community site.
quickest-silver · 12mo ago
You can simply set the maxRequestRetries option to 0:
import { BasicCrawler } from 'crawlee';

const crawler = new BasicCrawler({
    maxRequestRetries: 0,
    // ...the rest of your crawler options
});
inland-turquoise (OP) · 12mo ago
Maybe I misunderstood how the lib works, but wouldn't that just make the request go to failed status faster? Correct me if I'm wrong, but what I understood is:
- the URL is added to requests;
- if the request fails, it is retried up to maxRequestRetries times;
- if it still fails, it is marked as failed and can be reclaimed.
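To show what I mean, with maxRequestRetries: 0 I'd expect something like this (handler bodies are just placeholders) to hit failedRequestHandler on the very first error:
import { BasicCrawler } from 'crawlee';

const crawler = new BasicCrawler({
    maxRequestRetries: 0,
    requestHandler: async ({ request }) => {
        // ...scraping; any error thrown here would fail the request right away
    },
    failedRequestHandler: async ({ request }, error) => {
        // with zero retries, I'd expect this to run on the first error
        console.warn(`Request ${request.url} failed: ${error.message}`);
    },
});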
Oleg V. · 12mo ago
I guess you can use the noRetry option: https://crawlee.dev/api/next/core/class/Request#noRetry
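Something along these lines (the URL and handler are just placeholders):
import { BasicCrawler, Request } from 'crawlee';

const request = new Request({ url: 'https://example.com' });
request.noRetry = true; // a failure is reported, but the request is never retried

const crawler = new BasicCrawler({
    requestHandler: async ({ request }) => {
        // ...scrape the single page
    },
});

await crawler.run([request]);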
