harsh-harlequin · 2y ago

Please help: Reclaiming failed request back to the list or queue. ENOENT: no such file or directory

Could you please help me? I want to integrate Crawlee's CheerioCrawler into my Next.js app, but when I try to run it I get this error over and over. I've been trying to fix it for 4 days:
WARN CheerioCrawler: Reclaiming failed request back to the list or queue. ENOENT: no such file or directory, open '/home/mhanda/true-assistant/.next/server/vendor-chunks/data_files/headers-order.json'
WARN CheerioCrawler: Reclaiming failed request back to the list or queue. ENOENT: no such file or directory, open '/home/mhanda/true-assistant/.next/server/vendor-chunks/data_files/headers-order.json'
Here is my code:
import { CheerioCrawler, Configuration, Dataset } from 'crawlee';

export const collectLinksActions = async (url: string) => {
  // CheerioCrawler crawls the web using HTTP requests
  // and parses HTML using the Cheerio library.
  const crawler: CheerioCrawler = new CheerioCrawler({
    // Use the requestHandler to process each of the crawled pages.
    async requestHandler({ request, $, enqueueLinks, log }) {
      const title = $('title').text();
      log.info(`Title of ${request.loadedUrl} is '${title}'`);

      // Save results as JSON to ./storage/datasets/default
      await Dataset.pushData({ title, url: request.loadedUrl });

      // Extract links from the current page
      // and add them to the crawling queue.
      await enqueueLinks();
    },
  }, new Configuration({ persistStorage: false }));

  // Add first URL to the queue and start the crawl.
  await crawler.run(['https://crawlee.dev']);
};
Alexey Udovydchenko
Sounds like the runtime environment is corrupted. Please try https://docs.apify.com/cli/ and run under the official CLI.
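The ENOENT path under `.next/server/vendor-chunks/` suggests the Next.js bundler is pulling crawlee into the server bundle while leaving behind the JSON data files its header-generation dependency reads at runtime. A common workaround (not confirmed in this thread, and the exact option name depends on your Next.js version) is to keep crawlee external so Node resolves it from `node_modules` instead; a minimal sketch, assuming Next.js 13/14 with the App Router:

```javascript
// next.config.js — a sketch, not verified against this specific setup.
// Marking crawlee as an external server package keeps it out of the
// .next server bundle, so its data files are loaded from node_modules.
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    // In Next.js 15 this option moved to the top-level key
    // `serverExternalPackages` instead of living under `experimental`.
    serverComponentsExternalPackages: ['crawlee'],
  },
};

module.exports = nextConfig;
```

After changing the config, delete the `.next` directory and rebuild so the stale vendor chunks are regenerated.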
