unwilling-turquoise•16mo ago
Blocking network requests with crawlee PuppeteerCrawler
I'm trying to block network requests from specific domains within PuppeteerCrawler but can't get it to work.
I'd like to run something like this:
But it has to be initiated before page.goto.
I tried adding it to preNavigationHooks like so:
But this returns Error: Request is already handled!
Is there a way to do this with PuppeteerCrawler?
3 Replies
Hey, when you're using multiple intercept handlers, you need to check whether a request has already been handled: if (interceptedRequest.isInterceptResolutionHandled()) return;
Take a look at this: https://pptr.dev/guides/network-interception#multiple-intercept-handlers-and-asynchronous-resolutions
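A minimal sketch of that guard inside a pre-navigation hook, in case it helps. The domain list and the names isBlocked, handleRequest and blockDomainsHook are made up for illustration; only setRequestInterception, page.on('request'), isInterceptResolutionHandled, abort and continue are Puppeteer calls. Whether the guard alone is enough depends on what else is intercepting requests on the same page.

```javascript
// Hypothetical list of domains to block; replace with your own.
const BLOCKED_DOMAINS = ['doubleclick.net', 'google-analytics.com'];

// True when the URL's hostname is a blocked domain or a subdomain of one.
function isBlocked(url) {
    let hostname;
    try {
        hostname = new URL(url).hostname;
    } catch {
        return false; // leave unparsable URLs alone
    }
    return BLOCKED_DOMAINS.some((d) => hostname === d || hostname.endsWith(`.${d}`));
}

// Request handler that backs off if another handler already resolved the request.
async function handleRequest(interceptedRequest) {
    if (interceptedRequest.isInterceptResolutionHandled()) return;
    if (isBlocked(interceptedRequest.url())) {
        await interceptedRequest.abort();
    } else {
        await interceptedRequest.continue();
    }
}

// Hook shaped like a preNavigationHooks entry: it receives the crawling
// context and enables interception before the crawler navigates.
async function blockDomainsHook({ page }) {
    await page.setRequestInterception(true);
    page.on('request', handleRequest);
}
```

To try it, pass the hook in the crawler options as preNavigationHooks: [blockDomainsHook].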
Just be aware that request interception disables the browser cache, which makes large crawls perform much worse.
@kennysmithnanic Also, you can check the blockRequests method from PuppeteerCrawlingContext:
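A rough sketch of how that could look in a pre-navigation hook. The pattern list and the name blockRequestsHook are hypothetical; blockRequests and its extraUrlPatterns option are taken from the Crawlee docs as I understand them, where extraUrlPatterns appends to the default blocked patterns. As far as the docs describe it, this method does not rely on Puppeteer request interception, so it should avoid both the "already handled" error and the cache penalty.

```javascript
// Hypothetical URL fragments to block; adjust to the domains you care about.
const EXTRA_BLOCKED_PATTERNS = ['doubleclick.net', 'google-analytics.com'];

// Hook shaped like a preNavigationHooks entry: Crawlee passes the crawling
// context, which exposes blockRequests for the current page.
async function blockRequestsHook({ blockRequests }) {
    // extraUrlPatterns is assumed to add to Crawlee's default blocked patterns
    await blockRequests({ extraUrlPatterns: EXTRA_BLOCKED_PATTERNS });
}
```

To try it, pass the hook in the crawler options as preNavigationHooks: [blockRequestsHook].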