correct-apricot · 4mo ago

Wiping session between inputs

Hello! I'm crawling/scraping a site, which involves the following steps for each input:
1. Entering some data
2. Clicking "Load more" a bunch of times
3. Collecting the output

The problem is that the site behaves differently on the first entry than on the second, and it would be nice to run these in parallel. So I was going to use a new "session" for each input so that no data is retained between inputs, but I can't see how to do that. I'm guessing the site uses session cookies, localStorage, or some combination, as I can't see how to get it clean. I almost just want each request in a new incognito tab, haha. Any tips?
6 Replies
Pepa J · 4mo ago
Hi @BageDevimo, you can try clearing the cookies and other stored data in a pre-navigation hook, configured via preNavigationHooks in the PlaywrightCrawler options:
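import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    // ...
    preNavigationHooks: [
        async ({ page }) => {
            // Drop all cookies held by this page's browser context.
            await page.context().clearCookies();
            // Clear localStorage for the document the page currently shows.
            // Note: before the first navigation this may be about:blank,
            // where the browser may deny access to localStorage.
            await page.evaluate(() => {
                localStorage.clear();
            });
        },
    ],
    // ...
});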
correct-apricot (OP) · 4mo ago
I'll try now! Will this cause problems with concurrent sessions though?
exotic-emerald · 4mo ago
I've solved this using 'useIncognitoPages' in the launch context options. It opens each page in a fresh browser context, so nothing is shared between pages.
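For anyone landing here later, here is a minimal sketch of what that configuration might look like. It assumes Crawlee's PlaywrightCrawler; the buildCrawlerOptions helper, the maxConcurrency value, and the example URL are hypothetical and only there for illustration. As far as I understand, this also sidesteps the concurrency worry above: the clearCookies hook acts on the page's browser context, which pages share when they are not opened in incognito mode, so it could wipe cookies under other pages that are still running.

```javascript
// Sketch: build PlaywrightCrawler options that isolate every page.
// buildCrawlerOptions and the maxConcurrency value are hypothetical.
function buildCrawlerOptions(requestHandler) {
    return {
        launchContext: {
            // Open each page in its own incognito browser context,
            // so cookies and localStorage are not shared between pages.
            useIncognitoPages: true,
        },
        // With isolated pages, several inputs can be processed in parallel.
        maxConcurrency: 5,
        requestHandler,
    };
}

// Usage (requires the crawlee and playwright packages):
//   import { PlaywrightCrawler } from 'crawlee';
//   const crawler = new PlaywrightCrawler(
//       buildCrawlerOptions(async ({ page, request }) => {
//           // enter the data, click "Load more", collect the output
//       }),
//   );
//   await crawler.run(['https://example.com/']);
```

Keeping the options in a plain function makes it easy to check the settings without launching a browser; the request handler itself is left as a stub.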
correct-apricot (OP) · 4mo ago
This works very well, thank you @Crafty!
exotic-emerald · 4mo ago
Npnp!
