rare-sapphire•3y ago
Crawlee doesn't process newly enqueued links via enqueueLinks
Hi folks, I'm trying to build a crawler that retrieves a body (Buffer) and then enqueues the next "page" to be crawled, if it exists (has_next === true). The problem is that ?page=1 gets processed but the page enqueued via enqueueLinks doesn't; Crawlee states that it has processed all links (1 of 1).
I have confirmed that has_next is indeed true and that enqueueLinks gets called.
Am I missing something obvious?
4 Replies
rare-sapphireOP•3y ago
Hi again, I just cleaned up the code example, to make it easier on the eyes.
You can use the crawler.addRequests function, like this:
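A minimal sketch of that approach, not the snippet originally posted in this reply. It assumes the listing is paginated with a `page` query parameter (as in the `?page=1` URL from the question) and that the response exposes `has_next`. `nextPageUrl` and `enqueueNextPage` are hypothetical helper names, and `RequestAdder` is a stand-in type for the one crawler method used here, `addRequests`.

```typescript
// Stand-in for the part of a Crawlee crawler this sketch relies on.
// Crawlee crawlers expose addRequests(); the type name itself is hypothetical.
interface RequestAdder {
    addRequests(requests: string[]): Promise<unknown>;
}

// Build the URL of the following page by incrementing the `page` query parameter.
// Assumes `page` is either absent (treated as page 1) or numeric.
function nextPageUrl(currentUrl: string): string {
    const url = new URL(currentUrl);
    const current = Number(url.searchParams.get('page') ?? '1');
    url.searchParams.set('page', String(current + 1));
    return url.toString();
}

// Enqueue the next page explicitly instead of relying on link extraction.
// Resolves to the enqueued URL, or null when there is no next page.
async function enqueueNextPage(
    crawler: RequestAdder,
    currentUrl: string,
    hasNext: boolean,
): Promise<string | null> {
    if (!hasNext) return null;
    const next = nextPageUrl(currentUrl);
    await crawler.addRequests([next]);
    return next;
}
```

Inside a request handler this could be called as `await enqueueNextPage(crawler, request.url, hasNext)`, where `hasNext` is read from the parsed response body.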
ambitious-aqua•3y ago
Agree with Honza on crawler.addRequests.
enqueueLinks's internal logic is quite complex. I often run the crawler in debug mode to see what is going on, or check its return value to see if it behaves as expected.
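As a sketch of the return-value check: the helper below summarizes what an enqueue call reports. It assumes the result has the `processedRequests` / `unprocessedRequests` shape that `enqueueLinks` resolves to. The `EnqueueResultLike` types and the `summarizeEnqueueResult` name are hypothetical and only mirror the fields used here.

```typescript
// Hypothetical structural types mirroring only the fields this sketch reads.
interface ProcessedRequestLike {
    uniqueKey: string;
    wasAlreadyPresent: boolean;
    wasAlreadyHandled: boolean;
}

interface EnqueueResultLike {
    processedRequests: ProcessedRequestLike[];
    unprocessedRequests: unknown[];
}

interface EnqueueSummary {
    added: number;
    duplicates: number;
    unprocessed: number;
}

// Count how many requests were newly added, how many were already known
// to the queue, and how many could not be processed at all.
function summarizeEnqueueResult(result: EnqueueResultLike): EnqueueSummary {
    const duplicates = result.processedRequests.filter(
        (r) => r.wasAlreadyPresent || r.wasAlreadyHandled,
    ).length;
    return {
        added: result.processedRequests.length - duplicates,
        duplicates,
        unprocessed: result.unprocessedRequests.length,
    };
}
```

A summary with all three counts at zero would suggest that nothing reached the queue at all (for example, no links were found to enqueue), while a non-zero `duplicates` count would suggest the request was recognized as one the queue already knew about.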
rare-sapphireOP•3y ago
Thank you both @HonzaS and @strajk for answering. I've got things working by using crawler.addRequests().