extended-salmon•17mo ago
elementHandle.$$: Target page, context or browser has been closed
Facing the above error when scraping through a proxy pool of 5 datacenter IPs. Using Playwright with the puppeteer stealth plugin.
Any help troubleshooting much appreciated. Thanks in advance 🙂
7 Replies
extended-salmonOP•17mo ago
I’ve also removed the Playwright VS Code extension, as a number of GitHub issues mentioned the problem showing up there… 🤔
Update: this seems to be a memory issue. When more proxies are assigned to the proxy pool, Crawlee appears to scale up more aggressively, and memory fills rapidly since Playwright isn’t exactly lightweight. The error above occurs because all Chrome instances get killed.
Any recommendations on how to optimise here?
Hi, are you using the latest version of Crawlee? There were some known memory issues that should be addressed in the latest version. Alternatively, if your problem persists, you can try using 3.8.2 for now and we will take a closer look.
extended-salmonOP•16mo ago
Yes, using the latest version
Are there currently memory issues known to the internal team?
I’m struggling to get >5 concurrent Playwright crawlers with 8 GB of memory
Adjusting the threshold in crawlee.json doesn’t do much, as the browsers will just exit despite there being sufficient system memory
Can you give any more context or point me to where I can read more? @lemurio
@Phonebox hm, the issues were known and should be resolved in version 3.9.2. Would you be able to provide a small reproduction where this issue occurs?
extended-salmonOP•16mo ago
Yes, I will move some logic around and send a minimal example over shortly
Are there any recommendations for more lightweight headless browsers and how to configure these with Crawlee?
Chromium should be the most lightweight, but it is also the default now. You could also try limiting the max concurrency.
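As a rough sketch of what limiting the max concurrency could look like: the helper below derives a cap from a memory budget instead of hard-coding one. `concurrencyForMemory`, the 700 MB per-browser figure, and the 50% budget are all assumptions made up for illustration, not Crawlee defaults or measured values. Only the `minConcurrency` and `maxConcurrency` names correspond to actual crawler options.

```typescript
// Hypothetical helper, not part of Crawlee: pick a concurrency cap that
// should fit inside a given memory budget.
interface ConcurrencyLimits {
    minConcurrency: number;
    maxConcurrency: number;
}

function concurrencyForMemory(
    budgetMb: number,      // memory you are willing to give the crawler
    perBrowserMb: number,  // ASSUMED rough cost of one browser page; measure your own
    hardCap = 10,          // upper bound regardless of how much memory is free
): ConcurrencyLimits {
    if (budgetMb <= 0 || perBrowserMb <= 0) {
        throw new RangeError('budgetMb and perBrowserMb must be positive');
    }
    // How many browsers fit in the budget, rounded down.
    const fit = Math.floor(budgetMb / perBrowserMb);
    // Always allow at least one, never more than the hard cap.
    const maxConcurrency = Math.max(1, Math.min(fit, hardCap));
    return { minConcurrency: 1, maxConcurrency };
}

// Example: an 8 GB machine with half of it set aside for browsers,
// assuming roughly 700 MB per browser.
const limits = concurrencyForMemory(8192 * 0.5, 700);
console.log(limits);
```

The resulting object can be spread into the crawler options, for example `new PlaywrightCrawler({ ...limits, requestHandler })`, so the autoscaler should not go beyond what the machine can hold even when the proxy pool grows.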