MA · 2d ago

Initializing Cloudflare cookies with Crawlee

Hi, I am currently using a PlaywrightCrawler to initialize Cloudflare cookies so that I can then send requests to the website programmatically. My problem is that the target website redirects to itself several times before the CF cookie is ready, and I have not managed to handle that in my code. You know the cookie is ready when you get a 200 status code. Code:
import { PlaywrightCrawler, ProxyConfiguration } from 'crawlee';

const crawler = new PlaywrightCrawler({
    // This is triggered only once, on the first 403 response
    async requestHandler({ request, page, log, response }) {
        log.info(`Processing ${response?.status()} ${request.url}...`);
        const cookies = await page.context().cookies();
        console.log(cookies);
        await page.waitForTimeout(10000);
    },
    headless: false,
    retryOnBlocked: false,
    sessionPoolOptions: {
        blockedStatusCodes: [429], // Do not block 403
    },
    // Sometimes the website does multiple redirects before the cookie is ready
    requestHandlerTimeoutSecs: 99999,
    maxRequestRetries: 0,
    proxyConfiguration: new ProxyConfiguration({
        proxyUrls: ['http://user:pass@host:port'],
    }),
});

await crawler.run(['https://www.example.com']);
This library is probably not the best fit for my use case, but I like how Crawlee makes it simple to run a stealthy browser that is easy to interact with. Is PlaywrightCrawler the right crawler for this? Could somebody help me with the implementation, please?
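For illustration, one way to replace the fixed 10-second wait might be to poll the browser context until the cookie actually appears. This is only a rough sketch, not a confirmed solution: `waitForCookie` is a hypothetical helper (not part of Crawlee or Playwright), and the cookie name `cf_clearance` used in the usage note is an assumption about what the target site sets.

```javascript
// Hypothetical helper: poll an async cookie source until a named cookie appears.
// `getCookies` is any async function returning an array of { name, value } objects,
// for example () => page.context().cookies() inside the requestHandler.
async function waitForCookie(getCookies, name, { timeoutMs = 60000, intervalMs = 500 } = {}) {
    const deadline = Date.now() + timeoutMs;
    while (true) {
        const cookies = await getCookies();
        const match = cookies.find((cookie) => cookie.name === name);
        if (match) return match;
        if (Date.now() >= deadline) {
            throw new Error(`Cookie "${name}" was not set within ${timeoutMs} ms`);
        }
        // Sleep before the next poll so redirects have time to complete.
        await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
}
```

Inside the requestHandler this could be called as `await waitForCookie(() => page.context().cookies(), 'cf_clearance')` in place of `page.waitForTimeout(10000)`. Since a 200 status is what signals readiness here, waiting on Playwright's `page.waitForResponse` with a predicate that checks `response.status() === 200` may be an alternative worth trying.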