Apify Discord Mirror

Updated 5 months ago

saving data in apify actor

At a glance

The community member is trying to save scraped data to a JSON file, but is not getting the expected output. They want to save the data to the Apify console instead, so they can then use MongoDB to store the data in a database. The community member has already set up a MongoDB schema and is asking how to save the data to the Apify console, access it, and potentially clean the data through a separate actor before saving it to the MongoDB database.

In the comments, another community member suggests using await Actor.pushData(productData); to push the data to the Apify dataset. However, the community member is unsure how to define Actor in the console without getting an error during the build.

I've tried saving the data I scrape with my actors to a rawData.json file,

however I don't get any JSON output even though the scraping works.

How would I save the data to the Apify console so that I can then use MongoDB to take that data and put it in my database?

I have my MongoDB schema already set up, so how would I save the data to the Apify console and access it?

Would I have to save it to the Apify dataset? If so, how? And how would I also put it through a cleaning process in the same actor or, if possible, a different actor, and THEN save it to a MongoDB database?

Would I have to install fs somehow in the Apify console to make this work?

Here's what I have for saving the JSON file so far:
bambawRouter.addHandler('BAMBAW_PRODUCT', async ({ page, request }) => {
    try {
        console.log('Scraping products');

        const site = 'Bambaw';

        const title = await page.$eval('h1.product__title', (el) => el.textContent?.trim() || '');

        const descriptions = await ......

        const productData = {
            url: request.loadedUrl,
            site,
            title,
            descriptions,
            originalPrice,
            salePrice,
            shippingInfo,
            reviewScore,
            reviewNumber,
        };

        productList.push(productData);

        console.log('Scraped ', productList.length, ' products');
        // Read the existing data from the rawData.json file
        let rawData: any = {};
        try {
            const rawDataStr = fs.readFileSync('rawData.json', 'utf8');
            rawData = JSON.parse(rawDataStr);
        } catch (error) {
            console.log('Error reading rawData.json:', error);
        }

        // Append the new data to the existing data
        if (rawData.productList) {
            rawData.productList.push(productData);
        } else {
            rawData.productList = [productData];
        }

        // Write the updated data back to the rawData.json file
        fs.writeFileSync('rawData.json', JSON.stringify(rawData, null, 2));
        console.log('rawData.json updated for Bambaw');
    } catch (error) {
        console.log('Error scraping product:', error);
        // reclaimRequest returns a promise, so wait for it before leaving the handler
        await bambawQueue.reclaimRequest(request);
        return;
    }
});
I think
await Actor.pushData(productData);

is probably what you want; this will push one item to the dataset.
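On the cleaning part of the question: because each call stores one item, a cleaning step could run inside the same actor on each item just before it is pushed. Below is a minimal sketch of such a step. The names `RawProduct`, `CleanProduct`, `parseNumber` and `cleanProduct` are hypothetical, and it assumes the scraped prices and review values arrive as strings with a dot as the decimal separator; neither is confirmed by the original post.

```typescript
// Raw values as they might come off the page (assumption: everything is scraped as a string).
interface RawProduct {
    url: string;
    site: string;
    title: string;
    originalPrice: string;
    salePrice: string;
    reviewScore: string;
    reviewNumber: string;
}

// Cleaned shape; null marks a value that could not be parsed.
interface CleanProduct {
    url: string;
    site: string;
    title: string;
    originalPrice: number | null;
    salePrice: number | null;
    reviewScore: number | null;
    reviewNumber: number | null;
}

// Pull the first number out of a string such as "€24.90" or "(1,280 reviews)".
// Assumption: commas are thousands separators and the decimal separator is a dot.
function parseNumber(text: string): number | null {
    const match = text.replace(/,/g, '').match(/\d+(\.\d+)?/);
    return match ? Number(match[0]) : null;
}

// Normalize one scraped item: collapse whitespace in the title, parse numeric fields.
function cleanProduct(raw: RawProduct): CleanProduct {
    return {
        url: raw.url,
        site: raw.site,
        title: raw.title.replace(/\s+/g, ' ').trim(),
        originalPrice: parseNumber(raw.originalPrice),
        salePrice: parseNumber(raw.salePrice),
        reviewScore: parseNumber(raw.reviewScore),
        reviewNumber: parseNumber(raw.reviewNumber),
    };
}
```

In the handler, the result of `cleanProduct(productData)` could be passed to `Actor.pushData` in place of the raw object, so only cleaned items reach the dataset.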
How do I define Actor in the console without getting an error during the build?
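One likely answer, sketched under assumptions: in Apify SDK v3, `Actor` is not defined in the console but imported from the `apify` package, which has to be listed under "dependencies" in package.json for the build to resolve it. The sketch below shows the usual call order (`init`, `pushData`, `exit`). So that it runs without the SDK installed, it uses a hypothetical in-memory stand-in named `Actor`; in a real actor the stand-in would be deleted and the commented import used instead.

```typescript
// In a real actor, remove the stand-in below and use the SDK import instead
// (assumption: Apify SDK v3, with "apify" listed under "dependencies" in package.json):
//
//   import { Actor } from 'apify';

// Shape of one scraped item; the fields are a subset of productData from the thread.
interface ProductData {
    url: string;
    site: string;
    title: string;
}

// Hypothetical in-memory stand-in that mimics only the three SDK calls used here.
const storedItems: ProductData[] = [];
const Actor = {
    async init(): Promise<void> { /* the real call prepares the actor's storages */ },
    async pushData(item: ProductData): Promise<void> { storedItems.push(item); },
    async exit(): Promise<void> { /* the real call shuts the actor down */ },
};

async function main(): Promise<void> {
    // Initialize before touching any storage.
    await Actor.init();

    // Placeholder item; in the thread this object is built inside the route handler.
    const productData: ProductData = {
        url: 'https://example.com/products/1',
        site: 'Bambaw',
        title: 'Example product',
    };

    // One call stores one item in the run's default dataset.
    await Actor.pushData(productData);

    // Finish the run once all work is done.
    await Actor.exit();
}

const finished: Promise<void> = main();
```

With the real import in place, `fs` and rawData.json are no longer needed: each pushed item should appear in the run's dataset in the Apify console, from where it can be exported or read by another actor.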