correct-apricot•2y ago

skipNavigation per route label, instead of manually adding it to each request with given label

Use case: when using the Cheerio, JSDOM, or LinkeDOM crawlers and their routers, I often want every route to be requested and parsed automatically except one. ATM I have to remember to specify skipNavigation every time I add a request with that label to the request queue (IIUC). Just food for thought, not urgent 🙂
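One way to avoid repeating the flag is to set it in a single place, keyed by label, just before the requests are enqueued. This is a minimal sketch, not a Crawlee feature: `RequestLike`, `withSkipNavigation`, the `LIST`/`API` labels, and the example URLs are all hypothetical names chosen for illustration. It relies only on `label` and `skipNavigation` being plain request options.

```typescript
// Minimal shape of a request for this sketch (hypothetical; a real request has more fields).
interface RequestLike {
    url: string;
    label?: string;
    skipNavigation?: boolean;
}

// Returns a copy of the list where every request whose label is in
// `skipLabels` has skipNavigation set to true; other requests pass through unchanged.
function withSkipNavigation<T extends RequestLike>(
    requests: T[],
    skipLabels: ReadonlySet<string>,
): T[] {
    return requests.map((request) =>
        request.label !== undefined && skipLabels.has(request.label)
            ? { ...request, skipNavigation: true }
            : request,
    );
}

// Example input: only the 'API' route should skip navigation.
const input: RequestLike[] = [
    { url: 'https://example.com/list', label: 'LIST' },
    { url: 'https://example.com/api/item/1', label: 'API' },
];
const prepared = withSkipNavigation(input, new Set(['API']));
```

The result could then be passed to whatever call enqueues the requests (for example `crawler.addRequests(prepared)`), so the list of "no navigation" labels lives in one spot instead of at every call site.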
5 Replies
correct-apricotOP•2y ago
or maybe @HonzaS or @Lukas Krivka have some nice out of the box idea/pattern how to do this? 🥽
Lukas Krivka•2y ago
Hey 🙂 If you know upfront that you want to skip it, then defining it when adding the request sounds good enough. For a dynamic skip, what we do is throw NonRetryableError in a pre-nav hook and then monkeypatch the log so that the error is not logged. Ugly, but it works.
correct-apricotOP•2y ago
If I'm reading the source code correctly, throwing in a preNav hook would cause the whole _runRequestHandler to fail, i.e. the requestHandler would not run at all. I would like to dynamically skipNavigation but still run the requestHandler, so I can do "custom navigation" there, e.g. via gotScraping.
André Mácola•5w ago
I'm looking for this too. ATM I'm using NonRetryableError, but my logs are ugly. How do I suppress the logs? Also, I think too many NonRetryableErrors will cause Crawlee to fail?
Matous•3w ago
Hi, you should be able to throw from the preNav hook with no problem (just try it). To suppress the logs, you can override the crawler's log completely, or you can take inspiration from here: https://discord.com/channels/801163717915574323/1394264565692371046
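For the log suppression, the monkeypatch idea can be sketched roughly as follows. This is only an illustration under assumptions: `LoggerLike` and `muteErrors` are hypothetical names, the demo logger stands in for the crawler's log, and both the log method Crawlee actually uses for these failures and the message text to match would need to be checked against your Crawlee version.

```typescript
// Minimal logger shape for this sketch (hypothetical; a real logger has more methods).
interface LoggerLike {
    error(message: string, data?: Record<string, unknown>): void;
}

// Patches log.error so that messages matching `pattern` are dropped.
// Returns a function that undoes the patch.
function muteErrors(log: LoggerLike, pattern: RegExp): () => void {
    const original = log.error.bind(log);
    log.error = (message: string, data?: Record<string, unknown>): void => {
        if (pattern.test(message)) return; // swallow the expected "skip" errors
        original(message, data);
    };
    return () => {
        log.error = original;
    };
}

// Demo with a stand-in logger that records what gets through.
const seen: string[] = [];
const demoLog: LoggerLike = {
    error: (message: string): void => {
        seen.push(message);
    },
};

const restore = muteErrors(demoLog, /skip-navigation/);
demoLog.error('Request failed: skip-navigation marker'); // dropped
demoLog.error('Request failed: timeout'); // passes through
restore();
demoLog.error('skip-navigation logged again after restore'); // passes through
```

Putting a recognizable marker into the message of the error you throw (here the made-up string `skip-navigation`) keeps the filter narrow, so genuine failures still show up in the logs.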
