frail-apricot, 2y ago

Difference between enqueueLinks and crawler.addRequests

Hey folks, I have a list of URLs like ["https://google.com"] etc. When I call enqueueLinks({urls, label: 'DETAIL'}), none of the links are enqueued and the crawler stops right there, but if I do
crawler.addRequests(filteredLinks.map(link => ({url: link, label: 'DETAIL'})))
the links are added as expected and the crawler works fine. I just wanted to know what the difference between the two is, and why enqueueLinks was not working here. Crawlee version is 3.7.0.
4 Replies
frail-apricot (OP), 2y ago
The only thing I can figure is that the URLs I am enqueuing are absolute URLs, e.g. 'https://google.com', instead of relative ones, but this shouldn't be the reason, right? I ran another scraper of mine and that one worked fine. OK, it really does work with relative links only, but why?
correct-apricot, 2y ago
@AltairSama2 there is, I believe, a config option that controls whether it enqueues links from the same domain only or from all domains. Could that be your problem?
frail-apricot (OP), 2y ago
Yeah, that was the issue. There's a discussion on GitHub about it. You need to use the enqueueing strategy 'all'.
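For anyone landing here later, this is roughly what the fix looks like. It is only a sketch: it assumes Crawlee 3.x, where enqueueLinks accepts a strategy option and (as far as I know) defaults to 'same-hostname'. The hard-coded filteredLinks array is a stand-in for the list from the question, and the handler is meant to be passed as requestHandler to whichever crawler class you use.

```javascript
// Stand-in for the list of absolute URLs from the question.
const filteredLinks = ['https://google.com'];

// Sketch of a request handler; pass it as `requestHandler` when constructing
// a Crawlee crawler (for example, new CheerioCrawler({ requestHandler })).
async function requestHandler({ enqueueLinks }) {
    await enqueueLinks({
        urls: filteredLinks,
        label: 'DETAIL',
        // Assumption: without this, Crawlee 3.x keeps only URLs on the same
        // hostname as the page being processed, so other sites are dropped.
        strategy: 'all',
    });
}
```

crawler.addRequests does not apply this kind of filtering, which would explain why the same URLs went through that way.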
Alexey Udovydchenko
If you want to automatically find and enqueue links, you should use the context-aware enqueueLinks function provided on the crawler contexts. It filters the provided URLs array, though, and in your example there is only a single URL to begin with, so nothing is left to enqueue. See https://crawlee.dev/api/core/function/enqueueLinks
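To illustrate the filtering, here is a simplified model of a same-hostname check. It is not Crawlee's actual implementation; filterSameHostname and the example.com URLs are made up for the illustration, and the real matching rules may differ in detail.

```javascript
// Simplified model of a "same-hostname" filter; NOT Crawlee's actual code.
// `filterSameHostname` is a hypothetical helper for illustration only.
function filterSameHostname(urls, pageUrl) {
    const pageHost = new URL(pageUrl).hostname;
    return urls.filter((candidate) => {
        // Relative links resolve against the page URL, so they always end up
        // on the page's own hostname; absolute links keep their own hostname.
        const resolved = new URL(candidate, pageUrl);
        return resolved.hostname === pageHost;
    });
}

const pageUrl = 'https://example.com/list';
const urls = ['https://google.com', '/detail/1', 'https://example.com/detail/2'];

// Only the relative link and the same-site absolute link survive.
console.log(filterSameHostname(urls, pageUrl));
```

Under this model, a list that contains only 'https://google.com' comes back empty when the current page is on another site, which matches the behaviour described in the question.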
