Large threaded, kubernetes scrape = Target page, context or browser has been closed

Ironically, @cryptorex just posted a similar issue ("TargetClosedError: Target page, context or browser has been closed (I've tried a lot)"), but I wanted to provide some additional context to see if they're related.

I'm:
  • Running a node app with worker threads (usually 32 of them)
  • Running multiple containers in kubernetes
Each thread:
  • Grabs 5 domains from my postgres DB (of 5 million!)
  • Loops through each domain
  • Creates a new PlaywrightCrawler with unique-named storages (to prevent collision / global deletion from crawlers in other threads)
  • Queues the domain's home page
  • Controllers then queue up some additional pages based on what's found on the home page
  • The results are processed in real time and pushed to the database (since we don't want to wait until all 5M are complete)
  • The thread-specific storages are then deleted using drop()
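One thing worth ruling out in the per-thread flow above is a storage that never gets dropped because the crawl threw before reaching drop(). Below is a minimal sketch of a try/finally wrapper for that. The helper names (`storageName`, `withScopedStorages`) and the `Droppable` interface are hypothetical, not part of Crawlee; with Crawlee, `open` could be something like `(name) => RequestQueue.open(name)`.

```typescript
// Minimal shape of a named storage. The only thing this sketch relies on
// is an async drop(), which Crawlee's named storages provide.
interface Droppable {
  drop(): Promise<void>;
}

// Hypothetical helper: a storage name unique to one worker and one batch.
function storageName(workerId: number, batchId: number, kind: string): string {
  return `w${workerId}-b${batchId}-${kind}`;
}

// Hypothetical helper: opens the storages, runs the crawl, and always drops
// every storage that was opened, even if a later open() or the crawl throws.
async function withScopedStorages<S extends Droppable, T>(
  names: string[],
  open: (name: string) => Promise<S>,
  work: (storages: S[]) => Promise<T>,
): Promise<T> {
  const opened: S[] = [];
  try {
    for (const name of names) {
      opened.push(await open(name));
    }
    return await work(opened);
  } finally {
    for (const storage of opened) {
      try {
        await storage.drop();
      } catch (err) {
        // Keep going so one failed drop() does not skip the remaining ones.
      }
    }
  }
}
```

If the drops already run unconditionally, this changes nothing; it only matters when an exception can skip the cleanup step.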
The Problem
This works flawlessly... for about 60 minutes. After that, I get plagued with "Target page, context or browser has been closed". It first shows up at around the one-hour mark and then steadily increases in frequency until I'm getting more failed records than successful ones (at which point I kill the cluster or restart it).
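To make the "more failed than successful" cutoff measurable per thread rather than eyeballed, a rolling window of recent outcomes is one option. This is only a sketch: `FailureWindow` and `isTargetClosed` are hypothetical names, and matching the error by its message text is an assumption based on the message quoted above.

```typescript
// Hypothetical rolling record of the most recent outcomes for one worker thread.
class FailureWindow {
  private readonly size: number;
  private outcomes: boolean[] = []; // true = failed record

  constructor(size: number) {
    this.size = size;
  }

  record(failed: boolean): void {
    this.outcomes.push(failed);
    if (this.outcomes.length > this.size) {
      this.outcomes.shift(); // forget the oldest outcome
    }
  }

  failureRate(): number {
    if (this.outcomes.length === 0) return 0;
    const failures = this.outcomes.filter((failed) => failed).length;
    return failures / this.outcomes.length;
  }

  // True once the window is full and failures outnumber successes.
  shouldRecycle(): boolean {
    return this.outcomes.length === this.size && this.failureRate() > 0.5;
  }
}

// Assumption: the error is recognised by its message text.
function isTargetClosed(err: unknown): boolean {
  return err instanceof Error && err.message.includes("has been closed");
}
```

Logging the time at which each thread first trips `shouldRecycle()` would show whether all threads degrade together or one at a time.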

What I've tried:
  • browserPoolOptions like retireBrowserAfterPageCount: 100 and closeInactiveBrowserAfterSecs: 200
  • await crawler.teardown(); in hopes that this would clear any sort of cache/memory that could be stacking up
  • A cron to restart my cluster 🤣
  • Ensuring the EBS volumes are not running out of space (they're 20GB each and seem to be 50% full when crashing)
  • Ensuring the pods have plenty of memory (running EC2s with 64GB memory and 16 CPUs (32 threads)). It seems to handle the load in the first hour just fine.
I suspect there's a leak or a store that isn't being cleared out, since it happens gradually?
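One way to test the leak theory would be to sample memory periodically and look at the trend rather than a single reading. The sketch below fits a least-squares line through the samples; `Sample` and `growthPerMinute` are hypothetical names. The samples could come from `process.memoryUsage().rss` in the Node process, but note that this does not include the browser processes Playwright launches, so those would need to be measured separately (for example at the container level).

```typescript
// One memory reading: when it was taken and how many bytes were in use.
interface Sample {
  tMs: number;
  bytes: number;
}

// Least-squares slope of bytes over time, expressed in bytes per minute.
// A value that stays clearly positive over the hour would point to gradual growth.
function growthPerMinute(samples: Sample[]): number {
  const n = samples.length;
  if (n < 2) return 0;
  const meanT = samples.reduce((sum, s) => sum + s.tMs, 0) / n;
  const meanB = samples.reduce((sum, s) => sum + s.bytes, 0) / n;
  let num = 0;
  let den = 0;
  for (const s of samples) {
    num += (s.tMs - meanT) * (s.bytes - meanB);
    den += (s.tMs - meanT) * (s.tMs - meanT);
  }
  // den is 0 only if every sample has the same timestamp.
  return den === 0 ? 0 : (num / den) * 60000;
}
```

A flat slope up to the point where the errors start would suggest looking elsewhere, such as open file handles or leftover browser processes.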