other-emerald, 2y ago

Instance not refreshing in API

I am building an API that crawls websites using Crawlee's PlaywrightCrawler. When a request reaches the crawl endpoint, it crawls the website successfully, but whenever a new request with different details is made, it refuses to crawl again.
11 Replies
other-emerald (OP), 2y ago
Basically, once the crawl completes, the crawler instance does not shut down, so it cannot be started again.
rare-sapphire, 2y ago
@Kunal Verma Did you find a solution for this problem? I'm facing the same issue.
other-emerald (OP), 2y ago
No. I'm now trying to maintain a queue for incoming requests and to drop the requestQueue after each run.
rare-sapphire, 2y ago
OK, I did the same: I created a named request queue and drop it after the crawl. How can we limit the maximum number of URLs crawled per requestQueue? This is not the same as maxRequestsPerCrawl.
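The workaround described above (one named request queue per API call, dropped once the crawl ends) can be sketched roughly as follows. This is only an illustration under stated assumptions: `withJobQueue`, `DroppableQueue`, and the `job-` name prefix are hypothetical names, not part of Crawlee, and the Crawlee wiring in the trailing comment is not executed here.

```typescript
import { randomUUID } from 'node:crypto';

// Minimal shape this sketch needs from a queue. Crawlee's RequestQueue
// exposes drop(), so an instance should satisfy it structurally (assumption).
interface DroppableQueue {
    drop(): Promise<void>;
}

// Hypothetical helper: opens a uniquely named queue for one job, runs the job,
// and drops the queue afterwards even if the job throws.
async function withJobQueue<Q extends DroppableQueue, T>(
    openQueue: (name: string) => Promise<Q>,
    job: (queue: Q) => Promise<T>,
): Promise<T> {
    // A fresh name per job keeps state from an earlier crawl out of this one.
    const queue = await openQueue(`job-${randomUUID()}`);
    try {
        return await job(queue);
    } finally {
        await queue.drop();
    }
}

// Possible wiring with Crawlee (not executed here; https://example.com is a placeholder):
//
//   import { PlaywrightCrawler, RequestQueue } from 'crawlee';
//
//   await withJobQueue(
//       (name) => RequestQueue.open(name),
//       async (requestQueue) => {
//           // A new crawler instance per API request, bound to that request's queue.
//           const crawler = new PlaywrightCrawler({
//               requestQueue,
//               requestHandler: async ({ request, log }) => {
//                   log.info(`Visited ${request.url}`);
//               },
//           });
//           await crawler.run(['https://example.com']);
//       },
//   );
```

Creating a new crawler and a new queue for every incoming API request means a later request does not see the already-handled URLs of an earlier one, which is one plausible reason a reused instance would appear to refuse to crawl again.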
other-emerald (OP), 2y ago
I used maxpagestocrawl in the crawler only
rare-sapphire, 2y ago
I couldn't find a reference to maxpagestocrawl in Crawlee documentation. Can you point me to it? Thanks!
other-emerald (OP), 2y ago
Quick Start | Crawlee
With this short tutorial you can start scraping with Crawlee in a minute or two. To learn more, read the Introduction.
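For readers looking for the page limit discussed above: the crawler-level option mentioned earlier in this thread is `maxRequestsPerCrawl`. If every API request gets its own crawler and queue, that option would act as a per-job cap. A minimal sketch of deriving it from per-job settings follows; `JobSettings`, `limitFor`, and the default of 20 are assumptions made for illustration.

```typescript
// Hypothetical per-job settings as they might arrive at the API endpoint.
interface JobSettings {
    startUrl: string;
    maxPages?: number;
}

// Assumed default cap; pick whatever suits the service.
const DEFAULT_MAX_PAGES = 20;

// Builds the crawler-level limit for one job. The result is meant to be spread
// into the options of a fresh crawler, for example:
//   new PlaywrightCrawler({ ...limitFor(job), requestQueue, requestHandler })
function limitFor(job: JobSettings): { maxRequestsPerCrawl: number } {
    const requested = job.maxPages ?? DEFAULT_MAX_PAGES;
    // Reject zero, negative, and fractional limits early.
    if (!Number.isInteger(requested) || requested < 1) {
        throw new RangeError(`maxPages must be a positive integer, got ${requested}`);
    }
    return { maxRequestsPerCrawl: requested };
}
```

Validating the value before constructing the crawler keeps a bad request from starting a browser at all.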
rare-sapphire, 2y ago
Thank you so much. Somehow I missed it. 🙏🙏
