vicious-gold · 2y ago

Using `transformRequestFunction` in `enqueueLinks` overrides `label`

I am using enqueueLinks to pass some URLs to another handler. This works fine when I do:
await enqueueLinks({
    label: "LIST_PLACES",
    urls: searchSquareLinks,
    strategy: "same-domain",
    userData,
})
but as soon as I add a transformRequestFunction, the label is overridden and the links are queued back to the handler they are being enqueued from:
await enqueueLinks({
    label: "LIST_PLACES",
    urls: searchSquareLinks,
    transformRequestFunction: (request) => {
        request.userData = {
            ...userData,
            zoomTarget: request.url.match(zoomRegex)?.[1],
        }
        return request
    },
    strategy: "same-domain",
    userData,
})
Why is the label being overridden when the only property of request being changed is the zoomTarget?
4 Replies
Lukas Krivka · 2y ago
I assume the userData is the current userData? That might be a bug in Crawlee since the label is piped through userData. Can you please copy this to https://github.com/apify/crawlee/issues?
vicious-gold (OP) · 2y ago
GitHub
label being overridden in enqueueLinks when using `transformReq...
Which package is this bug report for? If unsure which one to select, leave blank @crawlee/core Issue description I am using enqueueLinks to pass some URLs to another handler. This works fine when I...
vicious-gold (OP) · 2y ago
This was resolved. For anyone else coming across this: `request.label` is a shortcut for `request.userData.label`, so replacing `userData` in the transform function was overwriting the label. I couldn't see it because I thought `label` was a separate property.
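To illustrate the point above, here is a minimal, self-contained sketch that does not call Crawlee at all. It assumes the label passed to `enqueueLinks` has already been placed in `userData.label` by the time the transform function runs. The `RequestOptionsLike` type, the `zoomRegex` pattern, the example URL, and the `SEARCH_SQUARES` label are hypothetical stand-ins, since the thread does not show them.

```typescript
// Minimal stand-in for the object handed to transformRequestFunction.
// Hypothetical type for illustration only; not Crawlee's own definition.
interface RequestOptionsLike {
    url: string;
    userData?: Record<string, unknown>;
}

// Hypothetical pattern; the original zoomRegex is not shown in the thread.
const zoomRegex = /[?&]zoom=(\d+)/;

// userData of the page currently being handled. It already carries that
// page's own label (assumed name), which is what causes the surprise.
const currentUserData = { label: "SEARCH_SQUARES", query: "cafes" };

// Problematic: replaces userData wholesale, so the current page's label
// overwrites the label intended for the new request.
function overwritingTransform(request: RequestOptionsLike): RequestOptionsLike {
    request.userData = {
        ...currentUserData,
        zoomTarget: request.url.match(zoomRegex)?.[1],
    };
    return request;
}

// Safer: spread the incoming userData after the current one, so the label
// already set on the new request survives the merge.
function preservingTransform(request: RequestOptionsLike): RequestOptionsLike {
    request.userData = {
        ...currentUserData,
        ...request.userData,
        zoomTarget: request.url.match(zoomRegex)?.[1],
    };
    return request;
}

// Simulates a request that was given the label "LIST_PLACES" before the
// transform function runs (the assumption stated above).
function makeRequest(): RequestOptionsLike {
    return {
        url: "https://example.com/search?zoom=15",
        userData: { label: "LIST_PLACES" },
    };
}

const overwritten = overwritingTransform(makeRequest());
const preserved = preservingTransform(makeRequest());

console.log(overwritten.userData?.label); // "SEARCH_SQUARES"
console.log(preserved.userData?.label);   // "LIST_PLACES"
```

Setting `label: "LIST_PLACES"` explicitly inside the transform function would work just as well; the merge form simply avoids repeating the label in two places.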
eastern-cyan · 2y ago
Nice! I'm actually experiencing another issue with the same-domain strategy not working as expected, and I wonder if it's because I'm also overriding with transformRequestFunction
