unwilling-turquoise•3y ago
Which EC2 instance type is best suited for crawling?
I just tried a t3.small instance without much luck (it ran out of memory). I then tried an r3.large, which looks better but seems to be weak on CPU. Any hints?
6 Replies
It depends a lot on whether your crawler is HTTP-based or browser-based. For HTTP, 1 GB of RAM + 1/4 of a CPU is OK; for a browser, you need at least 4x as much.
quickest-silver•3y ago
Just curious: if someone wants to scrape 8-10k pages per day, will a VPS with 4 GB of RAM be sufficient? That is 8-10k pages spread over 24 hours, not all in a single go.
It should be for sure. But as I said, optimizing your scraper first can give you much more
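To put the 8-10k pages per day in perspective, it helps to convert the daily quota into an average request rate. The snippet below is just that arithmetic; the function name `pages_per_minute` is made up for illustration, and it assumes the load is spread evenly over the full 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def pages_per_minute(pages_per_day: int) -> float:
    """Average rate needed to spread a daily quota evenly over 24 hours."""
    return pages_per_day * 60 / SECONDS_PER_DAY


for quota in (8_000, 10_000):
    print(f"{quota} pages/day ~ {pages_per_minute(quota):.1f} pages/min")
# prints:
# 8000 pages/day ~ 5.6 pages/min
# 10000 pages/day ~ 6.9 pages/min
```

That is well under 10 pages per minute on average, which is why the per-page cost of the scraper matters more than the daily total.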
quickest-silver•3y ago
Any rough estimate of how much RAM I would need if I want to scrape 20k pages per day? Each request will spin up a new browser instance and close it, so that is 20k requests per day.
Maybe 4 GB - 8 GB
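A rough way to sanity-check that range is Little's law: average concurrent browsers = request rate × time each page keeps a browser open. The 10 seconds per page, 500 MB per browser instance, and 1 GB of base overhead used below are illustrative assumptions, not measurements from this thread, so swap in numbers from your own scraper.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def avg_concurrent_browsers(pages_per_day: int, seconds_per_page: float) -> float:
    """Little's law: average number in flight = arrival rate * time in system."""
    return pages_per_day / SECONDS_PER_DAY * seconds_per_page


def estimate_ram_mb(pages_per_day: int,
                    seconds_per_page: float = 10.0,  # ASSUMED time per page
                    mb_per_browser: int = 500,       # ASSUMED RAM per browser
                    base_mb: int = 1000) -> int:     # ASSUMED OS + runtime overhead
    """Rough average-load RAM estimate; real peaks will be higher."""
    browsers = math.ceil(avg_concurrent_browsers(pages_per_day, seconds_per_page))
    return browsers * mb_per_browser + base_mb


print(estimate_ram_mb(20_000))  # prints 2500 (MB) under the assumptions above
```

With these assumed numbers the average load comes to roughly 2.5 GB, so 4 GB - 8 GB leaves headroom for bursts and heavier pages.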