Apify & Crawlee • 2y ago
living-lavender

Blocking network requests with Crawlee PuppeteerCrawler

I'm trying to block network requests from specific domains within PuppeteerCrawler but can't get it to work.

I'd like to run something like this:
page.on('request', (req) => {
    // Abort any request whose URL contains the blocked keyword
    if (req.url().includes('bouncex')) {
        req.abort();
        return;
    }
    req.continue();
});

But the handler has to be registered before page.goto is called.

I tried adding it to preNavigationHooks like so:
preNavigationHooks: [
    async ({ page }, goToOptions) => {
        goToOptions!.waitUntil = "networkidle2";
        goToOptions!.timeout = 3600000;
        await blocker.enableBlockingInPage(page);
        page.on('request', (req) => {
            // Abort any request whose URL contains the blocked keyword
            if (req.url().includes('bouncex')) {
                req.abort();
                return;
            }
            req.continue();
        });
        await page.setViewport(viewportConfig);
    },
],

But this throws Error: Request is already handled!

Is there a way to do this with PuppeteerCrawler?
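
For anyone hitting the same error: my guess (not confirmed) is that blocker.enableBlockingInPage(page) registers its own request handler, so by the time the second handler runs the request may already have been aborted or continued. Puppeteer's HTTPRequest has an isInterceptResolutionHandled() method that a handler can check before acting. Below is a minimal sketch of that guard. The InterceptedRequest interface and the blockByKeyword helper are hypothetical names made up for this example; the interface only mirrors the handful of HTTPRequest methods the guard uses, so the snippet runs without Puppeteer installed.

```typescript
// Hypothetical minimal shape of the request object; Puppeteer's HTTPRequest
// is assumed to satisfy it (url, isInterceptResolutionHandled, abort, continue).
interface InterceptedRequest {
    url(): string;
    isInterceptResolutionHandled(): boolean;
    abort(): Promise<void>;
    continue(): Promise<void>;
}

type Outcome = 'skipped' | 'aborted' | 'continued';

// Hypothetical helper: aborts requests whose URL contains any keyword,
// but leaves the request alone if another handler already resolved it.
async function blockByKeyword(
    req: InterceptedRequest,
    keywords: string[],
): Promise<Outcome> {
    // Another handler already called abort/continue/respond: do nothing
    if (req.isInterceptResolutionHandled()) {
        return 'skipped';
    }
    // Abort when the URL contains one of the blocked keywords
    if (keywords.some((keyword) => req.url().includes(keyword))) {
        await req.abort();
        return 'aborted';
    }
    // Otherwise let the request through
    await req.continue();
    return 'continued';
}
```

Inside the hook this would be wired up as page.on('request', (req) => blockByKeyword(req, ['bouncex'])). If no other code has enabled interception on the page, page.setRequestInterception(true) would also be needed before abort() or continue() can be called at all.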