wet-aqua (Apify & Crawlee, 4y ago)

Dataset.open(..) doesn't init the dataset when called outside of the handler

Hi

Due to performance issues, I want to move all possible awaits out of the handler.
For example here:

router.addHandler('details', async ({request, page, enqueueLinks, log}) => {
  const data = await page.evaluate(() => {
    // collect data ..
    return collectedData;
  });
  const dataset = await Dataset.open('myData');
  await dataset.pushData(data);
})


I want to move the dataset initialisation out of the handler, like this:

const dataset = await Dataset.open('myData');

router.addHandler('details', async ({request, page, enqueueLinks, log}) => {
  const data = await page.evaluate(() => {
    // collect data ..
    return collectedData;
  });
  await dataset.pushData(data);
})


But now the dataset is not initialised when Crawlee starts: the folder ./storage/datasets/myData is not created, and I get this log:
WARN  PuppeteerCrawler: Reclaiming failed request back to the list or queue. Dataset with id: e4901ade-57c3-49ec-8300-5a96338d381b does not exist.


How can I properly initialise the dataset in this case?
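One workaround I can think of, as a minimal sketch only: keep the open inside the handler but memoise it, so the dataset is opened once on first use instead of on every request. This assumes the error comes from opening the dataset before the crawler has started, which I have not confirmed. `lazyOnce` and `getDataset` are hypothetical names, not Crawlee APIs.

```javascript
// Hypothetical helper (not part of Crawlee): memoise an async factory so it
// runs at most once, on first use, and every caller shares the same promise.
function lazyOnce(factory) {
  let promise;
  return () => {
    if (promise === undefined) {
      promise = Promise.resolve(factory()).catch((err) => {
        promise = undefined; // forget the failure so a later call can retry
        throw err;
      });
    }
    return promise;
  };
}

// Intended usage with the code from this post (assumes `Dataset` and `router`
// come from 'crawlee', as in the snippets above):
//
// const getDataset = lazyOnce(() => Dataset.open('myData'));
//
// router.addHandler('details', async ({ page }) => {
//   const data = await page.evaluate(() => { /* collect data .. */ });
//   const dataset = await getDataset(); // opens on the first request only
//   await dataset.pushData(data);
// });
```

After the first request, the await in the handler resolves from an already settled promise, so the cost of opening the dataset is paid only once.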
Thank you
Cheers
GT