Difference between enqueueLinks and crawler.addRequests
Difference between enqueueLinks and crawler.addRequests
At a glance
The community member is having an issue with the enqueueLinks function in the Crawlee library (version 3.7.0). When they call enqueueLinks({urls, label:'DETAIL'}), the links are not enqueued, and the crawler stops. However, when they use crawler.addRequests(filteredLinks.map(link=>({url:link, label:DETAIL}))), the links are added as expected, and the crawler works fine.
The community members discuss possible reasons for this issue. One suggests that the URLs being enqueued are absolute URLs (e.g., 'https://google.com') instead of relative ones, but they don't think this should be the reason. Another community member mentions that there might be a configuration option to control whether the crawler enqueues links from the same domain or all domains, and this could be the problem.
The answer is provided by a community member, who states that the issue is related to the enqueueing strategy and that the community member needs to use the "all" enqueueing strategy to enqueue both relative and absolute links.
Additionally, another community member suggests that the community member should use the context-aware enqueueLinks function provide
Hey folks, I have a list of urls like ["https//google.com"] etc, and when I call enqueueLinks({urls, label:'DETAIL'}), none of the links are enqueued and the crawler stops right there, but if I do
the links are added as expected and the crawler works fine, I just wanted what's the difference between the two and why enqueueLinks was not working here? crawlee ver is 3.7.0
the only thing I can figure is that the urls I am enqueuing are absolute urls e.g. 'https://google.com' instead of relative ones, but this shouldnt be the reason right?
If you want to automatically find and enqueue links, you should use the context-aware enqueueLinks function provided on the crawler contexts. Otherwise it will filter out provided URLs array and in your example its already single URL, see https://crawlee.dev/api/core/function/enqueueLinks