adverse-sapphire · 15mo ago

Prevent automatic reclaim of failed requests

Hi everyone! Hope you're all doing well. I have a small question about Crawlee. My use case is a little simpler than a full crawl: I just want to scrape a single URL every few seconds. To do this, I create a RequestList with just one URL and start the crawler. Sometimes the crawler hits HTTP errors and the request fails. I don't mind, since I'm going to run the crawler again after a few seconds anyway, so I'd prefer failed requests to be ignored rather than automatically reclaimed. Is there a way of doing this?
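For reference, this is roughly what I'm doing (the URL and interval are placeholders, and I vary the uniqueKey so the repeated URL isn't deduplicated by the request queue):

import { BasicCrawler } from 'crawlee';

const URL = 'https://example.com/status'; // placeholder target

const crawler = new BasicCrawler({
    async requestHandler({ request, log }) {
        // fetch and process the single page here
        log.info(`Scraping ${request.url}`);
    },
});

// Poll the same URL every few seconds.
for (;;) {
    await crawler.run([{ url: URL, uniqueKey: `${URL}#${Date.now()}` }]);
    await new Promise((resolve) => setTimeout(resolve, 5000));
}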
4 Replies
Hall · 15mo ago
This post has been pushed to the community knowledgebase. Any replies in this thread will be synced to the community site.
adverse-sapphire · 15mo ago
You can simply set the maxRequestRetries option to 0:
const crawler = new BasicCrawler({
    maxRequestRetries: 0,
    ...
});
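If you also want the final failure handled quietly instead of logged as an error, you could combine it with a failedRequestHandler. A rough sketch (untested):

import { BasicCrawler } from 'crawlee';

const crawler = new BasicCrawler({
    maxRequestRetries: 0,
    async requestHandler({ request, log }) {
        // fetch and process the page here
        log.info(`Handling ${request.url}`);
    },
    // With zero retries this runs on the first failure, replacing the
    // default handler that logs the request as failed.
    failedRequestHandler({ request }, error) {
        console.warn(`Ignoring failure for ${request.url}: ${error.message}`);
    },
});

await crawler.run(['https://example.com']);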
adverse-sapphire (OP) · 15mo ago
Maybe I misunderstood how the library works, but wouldn't that just make the request reach the failed status faster? Correct me if I'm wrong, but what I understood is:
- the URL is added to the requests;
- if a request fails, it is retried up to maxRequestRetries times;
- if it still fails, it is marked as failed and can be reclaimed.
Oleg V. · 15mo ago
I guess you can use the noRetry option: https://crawlee.dev/api/next/core/class/Request#noRetry
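Something like this, passing it in the request options (just a sketch, not tested):

import { BasicCrawler } from 'crawlee';

const crawler = new BasicCrawler({
    async requestHandler({ request, log }) {
        log.info(`Handling ${request.url}`);
    },
});

// noRetry marks this particular request as non-retryable, so a failure
// won't cause it to be reclaimed and retried.
await crawler.run([{ url: 'https://example.com', noRetry: true }]);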
