For 429 errors, would you consider adding a setting or changing the default behaviour to allow retries?
From what I have seen, when a request gets a 429 (rate limit) error, Crawlee rotates the session, but if every session it tries also returns 429 (for example, because all proxies have been rate-limited), the request is simply marked as handled and forgotten. I would expect this to be treated as a transient error that resolves with time, the same as a 500 error. Do you plan to support this change in behaviour in the future?
Hey @Eric
You can change the behavior, for example by placing such requests in a different queue so that they can be processed later. You may find this guide useful - https://crawlee.dev/python/docs/guides/error-handling
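For anyone landing here later, a minimal sketch of that idea: a `failed_request_handler` that parks requests which still fail with 429 after all retries into a separate named `RequestQueue`, so they can be re-crawled later once the rate limits have reset. This is not code from the guide; the queue name `rate-limited`, the substring check on the error text, and the import paths (which differ between crawlee versions) are all assumptions you would adapt to your setup.

```python
import asyncio

from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
from crawlee.storages import RequestQueue


async def main() -> None:
    crawler = BeautifulSoupCrawler(max_request_retries=3)

    # A separate, named queue used to park rate-limited requests so they can
    # be re-crawled later (e.g. by a second run once the limits have reset).
    retry_queue = await RequestQueue.open(name='rate-limited')

    @crawler.router.default_handler
    async def handler(context: BeautifulSoupCrawlingContext) -> None:
        await context.push_data({'url': context.request.url})

    @crawler.failed_request_handler
    async def failed_handler(context, error: Exception) -> None:
        # Called once all retries are exhausted. Assumption: rate limiting is
        # detected by looking for "429" in the error text; match on whatever
        # your crawler actually raises for HTTP 429 in your crawlee version.
        if '429' in str(error):
            await retry_queue.add_request(context.request)

    await crawler.run(['https://crawlee.dev'])


if __name__ == '__main__':
    asyncio.run(main())
```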
nice! thanks!
@Mantisus do you have something similar for the JS version?
@memo23 Unfortunately, I cannot recommend anything similar for the JS version. Ask in the relevant channel.