afraid-scarlet, 2y ago

Instance not refreshing in API

I am building an API that crawls websites using Crawlee's PlaywrightCrawler. When a request hits the crawl endpoint, the website is crawled successfully, but when a new request with different details is made, the crawler refuses to crawl again.
11 Replies
afraid-scarlet (OP), 2y ago
Basically, once the crawl is completed, the crawler instance does not shut down, so it cannot be started again.
realistic-cyan, 2y ago
@Kunal Verma Did you find a solution for this problem? I'm facing the same issue.
afraid-scarlet (OP), 2y ago
No. I am now trying to maintain a queue for requests and dropping the requestQueue after each run.
realistic-cyan, 2y ago
OK, I did the same: I created a named request queue and drop it after each crawl. How can we limit the maximum number of URLs crawled per requestQueue? This is not the same as maxRequestsPerCrawl.
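
For readers following the thread, here is a minimal sketch of the workaround described above: open a fresh named request queue for each API request, run a new PlaywrightCrawler against it, and drop the queue afterwards. It assumes the `crawlee` package (with Playwright) is installed. The names `queueNameFor` and `crawlOnce` are hypothetical, not part of Crawlee. It also assumes that, with a new crawler instance per request, `maxRequestsPerCrawl` is an acceptable cap for that single run; it may not be the per-queue limit asked about here.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical helper: builds a unique queue name for each API request,
// so a finished crawl never shares state with the next one.
export function queueNameFor(jobId: string = randomUUID()): string {
    return `crawl-${jobId}`;
}

// Hypothetical function: runs one isolated crawl and returns the visited URLs.
export async function crawlOnce(startUrls: string[], maxPages: number): Promise<string[]> {
    // Loaded lazily so that importing this module does not pull in the browser stack.
    const { PlaywrightCrawler, RequestQueue } = await import('crawlee');

    // A fresh named queue per call, as described in the thread.
    const requestQueue = await RequestQueue.open(queueNameFor());
    const visited: string[] = [];

    const crawler = new PlaywrightCrawler({
        requestQueue,
        // Caps the number of pages processed by this crawler run.
        maxRequestsPerCrawl: maxPages,
        async requestHandler({ request, enqueueLinks }) {
            visited.push(request.loadedUrl ?? request.url);
            await enqueueLinks();
        },
    });

    try {
        await crawler.run(startUrls);
    } finally {
        // Drop the queue so no handled-request state survives this call.
        await requestQueue.drop();
    }
    return visited;
}
```

An API handler could then call `await crawlOnce([url], 20)` once per incoming request, so each request gets its own crawler instance and queue.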
afraid-scarlet (OP), 2y ago
I used maxpagestocrawl in the crawler only
realistic-cyan, 2y ago
I couldn't find a reference to maxpagestocrawl in Crawlee documentation. Can you point me to it? Thanks!
afraid-scarlet (OP), 2y ago
Quick Start | Crawlee
With this short tutorial you can start scraping with Crawlee in a minute or two. To learn more, read the Introduction.
realistic-cyan, 2y ago
Thank you so much. Somehow I missed it. 🙏🙏