wet-aqua
Apify & Crawlee, 3y ago

Dataset.open(..) doesn't init the dataset when called outside of a handler

Hi

For performance reasons, I want to move every await I can out of the handler.
For example, here:

router.addHandler('details', async ({ request, page, enqueueLinks, log }) => {
    const data = await page.evaluate(() => {
        // collect data ..
        return collectedData;
    });
    const dataset = await Dataset.open('myData');
    await dataset.pushData(data);
});

I want to move the dataset initialization out of the handler, like this:

const dataset = await Dataset.open('myData');

router.addHandler('details', async ({ request, page, enqueueLinks, log }) => {
    const data = await page.evaluate(() => {
        // collect data ..
        return collectedData;
    });
    await dataset.pushData(data);
});

But now the dataset is not initialized when Crawlee starts.
The folder ./storage/datasets/myData is not created,
and I get this log:
WARN PuppeteerCrawler: Reclaiming failed request back to the list or queue. Dataset with id: e4901ade-57c3-49ec-8300-5a96338d381b does not exist.

How can I properly initialize the dataset in this case?
Thank you
Cheers
GT
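
One pattern that might fit the goal above is to keep the open call inside the handler's code path but run it only once, reusing the same promise on every later call. The sketch below is only an illustration of that pattern: `lazyOnce`, `openDataset`, `getDataset` and `handleDetails` are hypothetical names, and `openDataset` is an in-memory stand-in for `Dataset.open('myData')`. Whether this avoids the warning quoted above is an assumption and has not been verified against Crawlee.

```javascript
// Wrap an async opener so it runs at most once; later calls reuse the same promise.
function lazyOnce(open) {
  let pending = null;
  return () => {
    if (pending === null) {
      pending = open();
    }
    return pending;
  };
}

// Hypothetical in-memory stand-in for `() => Dataset.open('myData')`.
let openCalls = 0;
async function openDataset() {
  openCalls += 1;
  const items = [];
  return {
    items,
    async pushData(data) {
      items.push(data);
    },
  };
}

const getDataset = lazyOnce(openDataset);

// Shape of the handler body: the first call opens the dataset, later calls reuse it.
async function handleDetails(data) {
  const dataset = await getDataset();
  await dataset.pushData(data);
}
```

In a real crawler the wrapper would presumably be created as `lazyOnce(() => Dataset.open('myData'))` and `getDataset()` awaited inside the 'details' handler. Note that this sketch caches a rejected promise as well, so a failed open would not be retried.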