deep-jade (2y ago)

Is it possible to get the selector of the individual links when using enqueueLinks()?

await enqueueLinks({
    selector: 'a',
    strategy: 'same-hostname',
    forefront: true,
    transformRequestFunction: (request) => {
        request.userData.parentUrl = page.url();
        request.userData.subRequest = true;
        request.userData.linkSelector = xxx; // <-- Is it possible to save the selectors of each link?

        return request;
    },
});
4 Replies
metropolitan-bronze (2y ago)
Hello @Juneberry, this is not possible using enqueueLinks. You can achieve it by parsing the page yourself (e.g. with cheerio) and creating the requests manually.
deep-jade (OP, 2y ago)
@vojtechmaslan thank you. How would I pass the links manually? Would I still use enqueueLinks?
metropolitan-bronze (2y ago)
You can do something like this:
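// Assumes `$` is a cheerio handle for the current page and `page` is the browser page from the question.
const requests = $('a[href]').map((_, el) => {
    // not sure what you want to have under linkSelector, these are just classes
    const $el = $(el);
    const classes = ($el.attr('class') ?? '').split(/\s+/).filter(Boolean);
    const linkSelector = classes.map((c) => `.${c}`).join('');
    return {
        // resolve relative hrefs against the current page URL
        url: new URL($el.attr('href'), page.url()).href,
        userData: { linkSelector },
    };
}).toArray().filter((r) => r.url.startsWith('http')); // skip mailto:, javascript:, etc.
await crawler.addRequests(requests);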
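If it helps, the per-link logic can be pulled into a small pure function so it is easy to test outside the crawler. This is only a sketch: `toLinkRequest` is a hypothetical helper name (not part of Crawlee or cheerio), and the selector it builds is just the tag plus the link's classes, which is not guaranteed to be unique on the page.

```javascript
// Hypothetical helper (not part of Crawlee): turns one link's raw attributes
// into a request object with a class-based selector stored in userData.
function toLinkRequest(href, classAttr, baseUrl) {
    if (!href) return null; // anchors without an href cannot be enqueued

    let url;
    try {
        url = new URL(href, baseUrl); // resolves relative hrefs against the page URL
    } catch {
        return null; // malformed href
    }
    // Only http(s) links make sense as crawl requests.
    if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;

    // Build something like "a.nav-link.active" from the class attribute.
    const linkSelector = 'a' + (classAttr ?? '')
        .split(/\s+/)
        .filter(Boolean)
        .map((c) => `.${c}`)
        .join('');

    return { url: url.href, userData: { linkSelector } };
}

const example = toLinkRequest('/docs', 'nav-link  active', 'https://example.com/start');
// example.url is 'https://example.com/docs'
// example.userData.linkSelector is 'a.nav-link.active'
```

Inside the request handler you could map every `a` element through this helper (passing its `href`, its `class`, and `page.url()`), drop the `null` results, and hand the rest to `crawler.addRequests()`.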
deep-jade (OP, 2y ago)
Awesome, thank you!
