helpful-purple • 8mo ago

How do I organize one authenticated account per session, IP, and user agent?

I want to create a bunch of authenticated users, each with its own consistent browser, proxy, user agent, fingerprint, schedule, browsing pattern, etc.
13 Replies
helpful-purple OP • 8mo ago
found it
xenial-black • 8mo ago
Hi, would you mind sharing your solution? I'm facing a similar issue. 🙂 Ta.
helpful-purple OP • 8mo ago
For proxy: set the proxy up front, or supply a function that returns a new proxy every time it's called. This goes in the constructor of the Crawlee scraper.

For session:
- Configure the session pool so that a session is invalidated after a single error.
- Use a pre-navigation hook (`preNavigationHooks`). In it, check whether the context's session has user data, i.e. whether it is signed in. If not, update the session with new user data and the other patterns associated with that user, sign the user in, and attach the cookie to the context. (If there is user data, the user is already signed in.)
- Make the session pool the same size as the number of accounts, so that one user maps to one session.

For behaviour: customize each user's behaviour manually, relying on the context attached to the session's userData.
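The session steps above could be sketched roughly like this. It is only a sketch under assumptions: `SessionLike` is a minimal stand-in for the `id` and `userData` parts of a Crawlee session, and `Account`, `accounts` and `bindAccount` are hypothetical names made up for illustration. The actual sign-in and cookie handling are left out.

```typescript
// Minimal stand-in for the parts of a Crawlee Session used here (id and userData).
interface SessionLike {
  id: string;
  userData: Record<string, unknown>;
}

// Hypothetical account profile; the field names are illustrative only.
interface Account {
  username: string;
  password: string;
  userAgent: string;
}

const accounts: Account[] = [
  { username: "alice", password: "pw-a", userAgent: "UA-A" },
  { username: "bob", password: "pw-b", userAgent: "UA-B" },
];

// Accounts that are not yet bound to any session.
const freeAccounts: Account[] = [...accounts];

// Logic meant to run inside a pre-navigation hook:
// bind one account to the session the first time the session is seen.
function bindAccount(session: SessionLike): Account {
  const existing = session.userData.account as Account | undefined;
  if (existing) return existing; // user data present: treat the session as signed in

  const account = freeAccounts.shift();
  if (!account) throw new Error(`No free account left for session ${session.id}`);

  session.userData.account = account;
  // A real hook would sign the user in here and attach the cookies to the context.
  return account;
}

// Crawler options matching the description: pool size equals the number of
// accounts, and a session is retired after a single error.
const crawlerOptions = {
  useSessionPool: true,
  persistCookiesPerSession: true,
  sessionPoolOptions: {
    maxPoolSize: accounts.length,
    sessionOptions: { maxErrorScore: 1 },
  },
};
```

In a real crawler, `bindAccount(session)` would be called from a pre-navigation hook and `crawlerOptions` spread into the crawler constructor. Note that this sketch never returns an account to `freeAccounts` when its session is retired.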
helpful-purple OP • 8mo ago
I got it wrong. Only newUrl is usable to associate one proxy with one session.
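A rough sketch of what such a mapping might look like, assuming "newUrl" here refers to the `newUrlFunction` option of Crawlee's `ProxyConfiguration`, which receives the session ID and returns a proxy URL. The proxy URLs and the `proxyBySession` map are hypothetical.

```typescript
// Hypothetical proxy list; replace with real proxy URLs.
const proxyUrls: string[] = [
  "http://proxy-1.example.com:8000",
  "http://proxy-2.example.com:8000",
];

// Remembers which proxy each session was given.
const proxyBySession = new Map<string, string>();

// Same shape as a function passed as `newUrlFunction`:
// takes the session ID and returns the proxy URL to use.
function newUrlFunction(sessionId?: string | number): string {
  const key = String(sessionId ?? "no-session");

  const known = proxyBySession.get(key);
  if (known !== undefined) return known; // same session, same proxy

  // First time this session is seen: hand out the next proxy in the list.
  const url = proxyUrls[proxyBySession.size % proxyUrls.length] as string;
  proxyBySession.set(key, url);
  return url;
}

// Assumed wiring: new ProxyConfiguration({ newUrlFunction })
```

With more sessions than proxies, this sketch wraps around and shares a proxy between sessions, so the list should be at least as long as the number of accounts.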
helpful-purple OP • 8mo ago
So this one has to be used:
[attached image, no description]
helpful-purple OP • 8mo ago
Optimizing web scraping: Scraping auth data using JSDOM | Crawlee ·...
Crawlee helps you build and maintain your crawlers. It's open source, but built by developers who scrape millions of pages every day for a living.
helpful-purple OP • 8mo ago
It still doesn't work if max concurrency is more than 1.
helpful-purple OP • 8mo ago
I think it's a bug. I tried running it many times against a single URL, each time with a different unique key. Even with max session usage: 1, a single session keeps being reused many times. Maybe the session usage count gets incremented after the request instead of before? Or the batching is bugged.
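To illustrate the hypothesis only, here is a toy model of a pool with a usage limit of 1. It is not Crawlee's session pool and says nothing about how Crawlee actually counts usage. All names are made up. It just shows why counting usage after the request, rather than when the session is picked, would let concurrent requests share one session.

```typescript
// Toy model only: this is not Crawlee's session pool.
interface ToySession {
  id: number;
  usageCount: number;
}

const MAX_USAGE = 1;

// Reuse the first session that still looks usable; otherwise create a new one.
function pickSession(pool: ToySession[]): ToySession {
  let session = pool.find((s) => s.usageCount < MAX_USAGE);
  if (!session) {
    session = { id: pool.length, usageCount: 0 };
    pool.push(session);
  }
  return session;
}

// Hands out sessions to `concurrency` requests that all start before any finishes.
function simulate(concurrency: number, markOnPick: boolean): number[] {
  const pool: ToySession[] = [];
  const picked: ToySession[] = [];
  for (let i = 0; i < concurrency; i++) {
    const session = pickSession(pool);
    if (markOnPick) session.usageCount++; // usage counted before the request runs
    picked.push(session);
  }
  // Usage counted only after the requests have finished.
  if (!markOnPick) picked.forEach((s) => s.usageCount++);
  return picked.map((s) => s.id);
}

console.log(simulate(3, false)); // [0, 0, 0]: one session shared by all three requests
console.log(simulate(3, true)); // [0, 1, 2]: one session per request
```

With a concurrency of 1 both variants behave the same, which would be consistent with the problem only showing up when max concurrency is more than 1.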
