stormy-gold•3y ago
puppeteer.connect()
Hey there! Is there still a way to connect crawlee to a remote browser instance using the browserWSEndpoint parameter, as when normally calling puppeteer.connect()?
3 Replies
puppeteer instance and connect, and then pass it to Crawler via launcher. https://crawlee.dev/docs/examples/playwright-crawler-firefox
GitHub
crawlee/packages/browser-pool at master · apify/crawlee
The key part you would need to override in the plugin is here: https://github.com/apify/crawlee/blob/master/packages/browser-pool/src/browser-pool.ts#LL633C35-L633C71
You would also need to make sure there is at most one browser open at a time if you do not connect to new browsers.
Might be easier to just use BasicCrawler and create new pages there manually
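A rough sketch of that BasicCrawler route, in case it helps. This is not taken from the Crawlee docs: createRemotePageHandler and the small interfaces are hypothetical names made up for illustration, and ws://localhost:3000 is a placeholder endpoint. The helper only relies on puppeteer.connect({ browserWSEndpoint }), browser.newPage(), page.goto(), page.title(), page.close() and browser.disconnect(). It connects once and reuses that single remote browser, which also covers the "at most one browser" point above.

```typescript
// Minimal structural types covering only the Puppeteer calls used below.
interface RemotePage {
  goto(url: string): Promise<unknown>;
  title(): Promise<string>;
  close(): Promise<void>;
}

interface RemoteBrowser {
  newPage(): Promise<RemotePage>;
  disconnect(): void | Promise<void>;
}

type Connect = (options: { browserWSEndpoint: string }) => Promise<RemoteBrowser>;

// Hypothetical helper (not part of Crawlee): connects lazily, exactly once,
// and opens one new page per request on that single remote browser.
function createRemotePageHandler(connect: Connect, browserWSEndpoint: string) {
  let browserPromise: Promise<RemoteBrowser> | undefined;

  const getBrowser = (): Promise<RemoteBrowser> => {
    if (!browserPromise) {
      browserPromise = connect({ browserWSEndpoint });
    }
    return browserPromise;
  };

  return {
    // Shaped like a BasicCrawler requestHandler: it only reads request.url.
    async requestHandler({ request }: { request: { url: string } }): Promise<string> {
      const browser = await getBrowser();
      const page = await browser.newPage();
      try {
        await page.goto(request.url);
        return await page.title();
      } finally {
        // Always close the page so tabs do not pile up on the remote browser.
        await page.close();
      }
    },

    // Detach from the remote browser without shutting it down.
    async close(): Promise<void> {
      if (browserPromise) {
        const browser = await browserPromise;
        browserPromise = undefined;
        await browser.disconnect();
      }
    },
  };
}

// Possible wiring (assumes `crawlee` and `puppeteer` are installed):
//
//   import { BasicCrawler } from 'crawlee';
//   import puppeteer from 'puppeteer';
//
//   const remote = createRemotePageHandler(
//     (options) => puppeteer.connect(options),
//     'ws://localhost:3000', // placeholder endpoint
//   );
//   const crawler = new BasicCrawler({
//     requestHandler: async (ctx) => {
//       console.log(await remote.requestHandler(ctx));
//     },
//   });
//   await crawler.run(['https://example.com']);
//   await remote.close();
```

The connect function is passed in rather than imported so the helper can be tried against a fake browser first. With BasicCrawler you give up the browser pool features (fingerprints, automatic browser retirement), so anything like that would have to be handled by hand.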