skinny-azure
Apify & Crawlee, 2y ago
2 replies

Apify in NestJS scheduler

Hello everyone,
I am using Apify + Crawlee's CheerioCrawler + the NestJS scheduler in my project, and I have run into an issue: the NestJS server process quits when Actor.exit() is called. Below is my code:
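// Six-field cron expression: fires at second 0 of every 5th minute
// (despite the "20 minutes" in the method name).
@Cron('0 */5 * * * *')
async handleEvery20Minutes() {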
    const config = new Configuration({ purgeOnStart: true, persistStorage: false });
    let cheerioCrawler = new CheerioCrawler({
      minConcurrency: 10,
      maxConcurrency: 50,
    
      // On error, retry each page at most once.
      maxRequestRetries: 1,
    
      // Increase the timeout for processing of each page.
      requestHandlerTimeoutSecs: 30,
    
      // Limit each crawl to at most 10 requests.
      maxRequestsPerCrawl: 10,
      requestHandler: defaultRouter
    }, config);
    
    await Actor.init();
    const crawlingCodes = await this.codesService.findAllCodesUrl();
    for (let i = 0; i < crawlingCodes.length; i++) {
      await cheerioCrawler.addRequests([
        {
          url: crawlingCodes[i].url,
          userData: {
            code: crawlingCodes[i].name,
          },
          uniqueKey: uuidv4()
        },
      ]);
    }
    await cheerioCrawler.run();
    
    await cheerioCrawler.teardown();
    
    await Actor.exit(); // the NestJS process quits when the scheduler reaches this line
}

I would like to call Actor.exit() to reset the index of the dataset JSON files. I can remove Actor.exit(), but then I get this error:
[Nest] 43924  - 06/08/2024, 8:30:02 PM   ERROR [Scheduler] Error: ENOENT: no such file or directory, open '/storage/datasets/default/000000001.json'

Has anyone had a similar issue when running Apify and Crawlee on the NestJS framework? Can you please help?
Thank you
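
For context on the behaviour described above, here is a minimal sketch of how a scheduled job could wrap a crawl between init() and exit() without ending the host process. It assumes that the Apify SDK's Actor.exit() accepts an options object with an `exit` flag that suppresses the final process.exit(); that should be checked against the installed SDK version. `ActorLike`, `runScheduledCrawl`, and `fakeActor` are hypothetical names used only for illustration, and the stand-in object lets the sketch run without the SDK installed.

```typescript
// Hypothetical minimal shape of the parts of Actor used here; the real
// object would come from `import { Actor } from 'apify'`.
interface ActorLike {
  init(): Promise<void>;
  // Assumption: passing `exit: false` skips the process.exit() call.
  exit(options?: { exit?: boolean }): Promise<void>;
}

// Runs one scheduled crawl between init() and exit() without letting
// exit() terminate the host (NestJS) process.
async function runScheduledCrawl(
  actor: ActorLike,
  crawl: () => Promise<void>,
): Promise<void> {
  await actor.init();
  try {
    await crawl();
  } finally {
    // Clean up even if the crawl throws, but keep the process alive.
    await actor.exit({ exit: false });
  }
}

// Stand-in Actor that records calls, so the sketch runs without the SDK.
const calls: string[] = [];
const fakeActor: ActorLike = {
  init: async () => {
    calls.push('init');
  },
  exit: async (options) => {
    calls.push(`exit(exit=${options?.exit})`);
  },
};
```

In the cron handler above, this would correspond to replacing the final call with `await Actor.exit({ exit: false })`, if the installed SDK supports that option. Whether it also resets the dataset file index is not something this sketch demonstrates.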