Bind session and proxy together
Hi,
I have a small problem: my sessions and proxies don't stay together, which I expected to be the default behavior.
When I log the session id and the proxy port in my router, proxy.sessionId does not match session.id.
Results look like:
I don't know whether the session may change after the proxy is assigned:
* https://github.com/apify/crawlee/blob/master/packages/browser-crawler/src/internals/browser-crawler.ts#L504
* https://github.com/apify/crawlee/blob/master/packages/browser-crawler/src/internals/browser-crawler.ts#L534
4 Replies
I created an issue for this: https://github.com/apify/crawlee/issues/2503
They should stay together until the session has been discarded.
Is there an option to log when a session is created/discarded? The first session is persistent and correct, but the session related to the proxy is a random new one.
As described in the issue, I believe https://github.com/apify/crawlee/blob/master/packages/browser-crawler/src/internals/browser-crawler.ts#L531 is not working correctly, because the proxy is loaded above it, at https://github.com/apify/crawlee/blob/master/packages/browser-crawler/src/internals/browser-crawler.ts#L505, and hence too early.
INFO PlaywrightCrawler: session_AlZoomLhQU <- Session of window/session object
INFO PlaywrightCrawler: 10209 <- Proxy port
INFO PlaywrightCrawler: session_Dnha2MhDeX <- Session of proxy
....
INFO PlaywrightCrawler: session_AlZoomLhQU <- Persisted & correct session of window/session object
INFO PlaywrightCrawler: 10208 <- Proxy port
INFO PlaywrightCrawler: session_6jOviCJSHt <- Random new session of proxy
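The mismatch in the log output above can be checked programmatically instead of by eye. The following is only a minimal sketch: `SessionLike`, `ProxyInfoLike`, `BindingReport` and `checkBinding` are hypothetical names invented for illustration, and the two interfaces model just the fields mentioned in this thread (`session.id`, the proxy's `sessionId` and its port), not Crawlee's full types.

```typescript
// Hypothetical minimal shapes: only the fields discussed in this thread.
interface SessionLike {
  id: string;
}

interface ProxyInfoLike {
  sessionId?: string;
  port?: number | string;
}

interface BindingReport {
  bound: boolean;
  message: string;
}

// Compare the session attached to the request with the session id
// that the proxy was created for, and build a one-line log message.
function checkBinding(
  session: SessionLike | undefined,
  proxyInfo: ProxyInfoLike | undefined,
): BindingReport {
  if (!session || !proxyInfo) {
    return { bound: false, message: 'session or proxyInfo is missing' };
  }

  const bound = session.id === proxyInfo.sessionId;
  const message = bound
    ? `OK: session ${session.id} is bound to proxy port ${proxyInfo.port}`
    : `MISMATCH: session ${session.id} vs proxy session ${proxyInfo.sessionId} (port ${proxyInfo.port})`;

  return { bound, message };
}
```

Assuming the request handler receives `session` and `proxyInfo` in its context, it could log `checkBinding(session, proxyInfo).message` for every request, so that only the `MISMATCH` lines need to be inspected.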
You can set the log level to debug: https://crawlee.dev/api/core/class/Log