unwilling-turquoise
unwilling-turquoise•3y ago

Which EC2 instance type is best suited for crawling?

I just tried a t3.small instance without much luck (it ran out of memory). I then tried an r3.large, which looks better but seems to be weak on CPU. Any hints?
6 Replies
Lukas Krivka
Lukas Krivka•3y ago
It depends a lot on whether your crawler is HTTP-based or browser-based. For HTTP, 1 GB of RAM + 1/4 CPU is OK; for a browser, you need at least 4x as much.
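A rough sketch of that rule of thumb in Python, in case it helps when comparing instance types. The `Resources` class and `minimum_resources` function are made-up names for illustration; the only numbers taken from the reply are the 1 GB + 1/4 CPU baseline and the 4x multiplier, and the result should be read as a lower bound, not a guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    memory_gb: float
    cpu_cores: float


# Baseline from the reply above: an HTTP-based crawler is OK with 1 GB + 1/4 CPU.
HTTP_BASELINE = Resources(memory_gb=1.0, cpu_cores=0.25)

# "At least 4x as much" for browser-based crawling, so this is a floor.
BROWSER_MULTIPLIER = 4


def minimum_resources(browser_based: bool) -> Resources:
    """Return the rough minimum resources for one crawler process."""
    if not browser_based:
        return HTTP_BASELINE
    return Resources(
        memory_gb=HTTP_BASELINE.memory_gb * BROWSER_MULTIPLIER,
        cpu_cores=HTTP_BASELINE.cpu_cores * BROWSER_MULTIPLIER,
    )


if __name__ == "__main__":
    print("HTTP:   ", minimum_resources(browser_based=False))
    print("Browser:", minimum_resources(browser_based=True))
```

By this estimate a browser-based crawler wants at least 4 GB of RAM and one full core, which would be consistent with a t3.small (2 GB of RAM) running out of memory.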
quickest-silver
quickest-silver•3y ago
Just curious: if someone wants to scrape 8-10k pages per day, would a VPS with 4 GB of RAM be sufficient? That's 8-10k pages spread over 24 hours, not all in a single go.
Lukas Krivka
Lukas Krivka•3y ago
It should be, for sure. But as I said, optimizing your scraper first can give you much more.
quickest-silver
quickest-silver•3y ago
Any rough estimate of how much RAM I would need to scrape 20k pages per day? Each request will spin up a new browser instance and then close it. That's 20k requests per day.
Lukas Krivka
Lukas Krivka•3y ago
Maybe 4 GB - 8 GB
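For a sense of scale, here is a small back-of-the-envelope sketch in Python that turns a daily page count into an average rate and an average number of browsers open at once. The function names are made up for illustration, and the 10 seconds per page is purely an assumed figure (nobody in the thread measured it), so substitute your own timing.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def pages_per_second(pages_per_day: float) -> float:
    """Average request rate if the pages are spread evenly over 24 hours."""
    return pages_per_day / SECONDS_PER_DAY


def concurrent_browsers(pages_per_day: float, seconds_per_page: float) -> int:
    """Average number of browsers open at once, rounded up.

    Uses Little's law: items in flight = arrival rate * time per item.
    seconds_per_page is an assumption you have to measure yourself.
    """
    return math.ceil(pages_per_second(pages_per_day) * seconds_per_page)


if __name__ == "__main__":
    for daily in (8_000, 10_000, 20_000):
        rate = pages_per_second(daily)
        browsers = concurrent_browsers(daily, seconds_per_page=10)  # assumed 10 s
        print(f"{daily} pages/day = {rate:.3f} pages/s, about {browsers} browser(s) at once")
```

Spread evenly, 20k pages per day is one page roughly every 4.3 seconds, so only a handful of browsers need to be open at any moment if each page finishes in a few seconds. Bursty schedules or slow pages would push that number up.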