sensitive-blue
sensitive-blue7mo ago

Wiping session between inputs

Hello! I'm crawling / scraping a site, which involves the following steps for each input:
1. Entering some data
2. Clicking "Load more" a bunch of times
3. Collecting the output
The problem is that the site behaves differently on the first entry than on the second, and it would be nice to run these in parallel. So I was going to use a new "session" for each input so that no data is retained between inputs, but I can't see how to do that. I'm guessing the site uses session cookies, localStorage, or some combination of the two, because I can't work out how to get a clean state. I almost just want each request in a new incognito tab, haha. Any tips?
6 Replies
Pepa J
Pepa J7mo ago
Hi @BageDevimo, you can try clearing the cookies and other storage in a pre-navigation hook, configured through preNavigationHooks in the PlaywrightCrawler options:
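import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    // ...
    preNavigationHooks: [
        // Runs before each navigation
        async ({ page }) => {
            // Drop all cookies held by this page's browser context
            await page.context().clearCookies();
            // Clear localStorage for whatever origin the page is currently on.
            // Caveat: on a fresh page that is still on about:blank, access to
            // localStorage may be denied and this call may throw.
            await page.evaluate(() => {
                localStorage.clear();
            });
        },
    ],
    // ...
});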
sensitive-blue
sensitive-blueOP7mo ago
I'll try now! Will this cause problems with concurrent sessions though?
other-emerald
other-emerald7mo ago
I've solved this using 'useIncognitoPages' in the launch context options. It gives each page a fresh browser context, so cookies and localStorage are not shared between pages.
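For anyone landing here later, a minimal sketch of what that option might look like in practice. It only builds the options object, so it runs without Crawlee installed; the handler name `handleInput`, the `maxConcurrency` value, and the example URL are assumptions for illustration, not something from this thread.

```javascript
// Sketch only: an options object meant to be passed to Crawlee's
// PlaywrightCrawler constructor. Names other than `launchContext` and
// `useIncognitoPages` are illustrative assumptions.
const crawlerOptions = {
    launchContext: {
        // Open every page in its own incognito browser context, so
        // cookies and localStorage are not shared between pages.
        useIncognitoPages: true,
    },
    // Assumed value; pages running in parallel still get separate contexts.
    maxConcurrency: 5,
    // Hypothetical handler: one call per input.
    requestHandler: async function handleInput({ page, request }) {
        // 1. enter the data, 2. click "Load more", 3. collect the output
    },
};

// Usage (requires the `crawlee` and `playwright` packages):
//   import { PlaywrightCrawler } from 'crawlee';
//   const crawler = new PlaywrightCrawler(crawlerOptions);
//   await crawler.run(['https://example.com']);
```

Because isolation happens at the browser-context level, this should also cover the concurrency concern raised above: nothing has to be wiped by hand between inputs.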
sensitive-blue
sensitive-blueOP7mo ago
This works very well, thank you @Crafty!
other-emerald
other-emerald7mo ago
Npnp!
