harsh-harlequin · 2y ago

Please help: Reclaiming failed request back to the list or queue. ENOENT: no such file or directory

Could you please help me? I want to integrate Crawlee's CheerioCrawler into my Next.js app, but when I try to run it I get this error over and over. I've been trying to fix it for 4 days:
WARN CheerioCrawler: Reclaiming failed request back to the list or queue. ENOENT: no such file or directory, open '/home/mhanda/true-assistant/.next/server/vendor-chunks/data_files/headers-order.json'
WARN CheerioCrawler: Reclaiming failed request back to the list or queue. ENOENT: no such file or directory, open '/home/mhanda/true-assistant/.next/server/vendor-chunks/data_files/headers-order.json'
Here is my code:
import { CheerioCrawler, Configuration, Dataset } from 'crawlee';

export const collectLinksActions = async (url: string) => {
  // CheerioCrawler crawls the web using HTTP requests
  // and parses HTML using the Cheerio library.
  const crawler: CheerioCrawler = new CheerioCrawler({
    // Use the requestHandler to process each of the crawled pages.
    async requestHandler({ request, $, enqueueLinks, log }) {
      const title = $('title').text();
      log.info(`Title of ${request.loadedUrl} is '${title}'`);

      // Save results as JSON to ./storage/datasets/default
      await Dataset.pushData({ title, url: request.loadedUrl });

      // Extract links from the current page
      // and add them to the crawling queue.
      await enqueueLinks();
    },
  }, new Configuration({ persistStorage: false }));

  // Add first URL to the queue and start the crawl.
  await crawler.run(['https://crawlee.dev']);
};
Alexey Udovydchenko
Sounds like the runtime environment is corrupted. Please try https://docs.apify.com/cli/ and run under the official CLI.
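The ENOENT path under `.next/server/vendor-chunks/` suggests the Next.js bundler is pulling crawlee into the server bundle while leaving behind the JSON data files its header-generation dependency reads at runtime. A common workaround (not confirmed in this thread, and the exact option name depends on your Next.js version) is to keep crawlee external so Node resolves it from `node_modules` instead; a minimal sketch, assuming Next.js 13/14 with the App Router:

```javascript
// next.config.js — a sketch, not verified against this specific setup.
// Marking crawlee as an external server package keeps it out of the
// .next server bundle, so its data files are loaded from node_modules.
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    // In Next.js 15 this option moved to the top-level key
    // `serverExternalPackages` instead of living under `experimental`.
    serverComponentsExternalPackages: ['crawlee'],
  },
};

module.exports = nextConfig;
```

After changing the config, delete the `.next` directory and rebuild so the stale vendor chunks are regenerated.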
