Apify Discord Mirror

Updated 5 months ago

Using `transformRequestFunction` in `enqueueLinks` overrides `label`

At a glance

The community member is using enqueueLinks to pass URLs to another handler. When they add a transformRequestFunction, the label is overridden, and the links are queued back to the original handler. The community member is unsure why the label is being overridden when the only change is to the zoomTarget property.

Another community member suggests this might be a bug in Crawlee, as the label is piped through userData. The issue was submitted to the Crawlee repository, and it was later resolved. The resolution was that request.label is a shortcut for request.userData.label, so the transform function was overwriting the label, which the community member couldn't see initially.

The community member is now experiencing another issue with the same-domain strategy not working as expected and wonders if it's because they're also overriding with transformRequestFunction.

I am using enqueueLinks to pass some URLs to another handler. This works fine when I do:
await enqueueLinks({
    label: "LIST_PLACES",
    urls: searchSquareLinks,
    strategy: "same-domain",
    userData,
})

but as soon as I add a transformRequestFunction, the label is overridden and the links are queued back to the handler they are being enqueued from:
await enqueueLinks({
    label: "LIST_PLACES",
    urls: searchSquareLinks,
    transformRequestFunction: (request) => {
        request.userData = {
            ...userData,
            zoomTarget: request.url.match(zoomRegex)?.[1],
        }
        return request
    },
    strategy: "same-domain",
    userData,
})

Why is the label being overridden when the only property of request being changed is the zoomTarget?
5 comments
I assume the userData is the current userData? That might be a bug in Crawlee, since the label is piped through userData. Can you please copy this to https://github.com/apify/crawlee/issues?
This was resolved - for anyone else coming across this, request.label is a shortcut for request.userData.label so the transform function was overwriting the label, but I couldn't see it because I thought label was a separate property.
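For anyone who wants to see the effect in isolation, here is a minimal sketch that runs without Crawlee. It assumes the object handed to transformRequestFunction already carries the target label in request.userData.label, and that the outer userData holds the current handler's label. The zoomRegex pattern, the SEARCH_SQUARES label, the example URL, and the makeRequest helper are all hypothetical stand-ins, since the original post does not show them.

```javascript
// Hypothetical pattern; the original post does not show zoomRegex.
const zoomRegex = /[?&]zoom=(\d+)/;

// Stand-in for the outer `userData` from the question. Assumption: it comes
// from the current request, so it carries the current handler's label.
const userData = { label: "SEARCH_SQUARES" };

// The transform from the question: userData is replaced wholesale, so the
// label inside the outer userData wins over the one set for the new request.
function replaceUserData(request) {
    request.userData = {
        ...userData,
        zoomTarget: request.url.match(zoomRegex)?.[1],
    };
    return request;
}

// One possible fix: spread request.userData after the outer userData so the
// label already attached to the new request is kept.
function mergeUserData(request) {
    request.userData = {
        ...userData,
        ...request.userData,
        zoomTarget: request.url.match(zoomRegex)?.[1],
    };
    return request;
}

// Simulated request object, assuming the label was already copied into
// userData before the transform runs.
const makeRequest = () => ({
    url: "https://example.com/map?zoom=15",
    userData: { label: "LIST_PLACES" },
});

const replaced = replaceUserData(makeRequest());
const merged = mergeUserData(makeRequest());

console.log(replaced.userData.label); // SEARCH_SQUARES
console.log(merged.userData.label);   // LIST_PLACES
```

Setting label explicitly inside the new userData object would work just as well; the point is only that the transform must not drop userData.label.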
I'm actually experiencing another issue with the same-domain strategy not working as expected, and I wonder if it's because I'm also overriding w/ transformRequestFunction.