fascinating-indigo · 2y ago

Anyone with a crawler with a lot of route handlers? Like 100s of route handlers in a single crawler?

Hello, I'm thinking about ways of creating a central crawler (like a central cheerio crawler) that would have handlers for several websites, so when new requests arrive I won't need to instantiate a new cheerio crawler. Is this a good idea? My project already has 174 crawlers in separate scripts that I have to manage with pm2, and it's becoming hell to maintain. Anyone with tips or big projects like mine?
lemurio · 2y ago
hi, yes of course, that would work. It would also be more effective this way, as you would have only one instance running when multiple requests arrive at the same time, thanks to the automatic scaling.
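A minimal sketch of this setup, assuming Crawlee's label-based router with a single CheerioCrawler; the labels SITE_A/SITE_B and the example URLs are placeholders, not anything from the thread:

```ts
import { CheerioCrawler, createCheerioRouter } from 'crawlee';

// One router, many label-based handlers - one per target site.
const router = createCheerioRouter();

router.addHandler('SITE_A', async ({ $, request, log }) => {
    log.info(`Handling ${request.url} as SITE_A`);
    // site-specific extraction logic goes here
});

router.addHandler('SITE_B', async ({ $, request, log }) => {
    log.info(`Handling ${request.url} as SITE_B`);
});

// Fallback for requests that arrive without a matching label.
router.addDefaultHandler(async ({ request, log }) => {
    log.warning(`No handler for label ${request.label} (${request.url})`);
});

// A single crawler instance serves all sites; the label on each
// request decides which handler runs.
const crawler = new CheerioCrawler({ requestHandler: router });

await crawler.run([
    { url: 'https://site-a.example.com/page', label: 'SITE_A' },
    { url: 'https://site-b.example.com/page', label: 'SITE_B' },
]);
```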
extended-salmon · 2y ago
I'm following. Does anyone have an example of adding more handlers/routes in one single project?
lemurio · 2y ago
Router | API | Crawlee
Simple router that works based on request labels. This instance can then serve as a requestHandler of your crawler.
```ts
import { Router, CheerioCrawler, CheerioCrawlingContext } from 'crawlee';

const router = Router.create();

// we can also use factory methods for specific crawling contexts, the above equals to:
// import { createCheerioRouter } from 'crawlee';
// const router = createCheerioRouter();
```
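To show how this could scale to many sites in one project, here is a hedged sketch of one possible layout (the file names `routes/site-a.ts` and `main.ts`, the `registerSiteA` helper, and the SITE_A label are hypothetical, not part of the Crawlee docs): each site lives in its own module and registers its handlers on a shared router, so adding a new site means adding one module instead of one more pm2 process.

```ts
// routes/site-a.ts (hypothetical file): each site registers its own handlers
import type { CheerioCrawlingContext, Router } from 'crawlee';

export function registerSiteA(router: Router<CheerioCrawlingContext>) {
    router.addHandler('SITE_A', async ({ $, request, log }) => {
        log.info(`SITE_A: ${request.url} - ${$('title').text()}`);
        // site-specific extraction goes here
    });
}

// main.ts (hypothetical file): one crawler, one router, all sites registered on it
import { CheerioCrawler, createCheerioRouter } from 'crawlee';
// import { registerSiteA } from './routes/site-a';

const router = createCheerioRouter();
registerSiteA(router);
// ...registerSiteB(router), registerSiteC(router), and so on for each site

const crawler = new CheerioCrawler({ requestHandler: router });
await crawler.run([{ url: 'https://site-a.example.com/', label: 'SITE_A' }]);
```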
