Apify Discord Mirror

Updated 5 months ago

not skipping over urls for unfound elements

At a glance

The community member is scraping product data from URLs and encountering issues where certain elements are not found, causing errors and preventing the rest of the data from being scraped. They have tried using try-catch statements and if-else statements to handle these cases, but the built-in crawlee error is still occurring. The community members have provided code examples and discussed the specific issue they are facing with the sale price and new tag elements, but have not found a solution that avoids the errors and allows the rest of the data to be scraped successfully.

When I scrape product data from product URLs, some elements are missing on some pages. Sometimes I want to check whether a tag exists and fall back to a different tag if it doesn't; sometimes a tag simply isn't there. In either case, I don't want the crawler to throw a full error for that one element and fail to scrape and save the rest of the data.
How do I avoid this "skipping" by overriding or changing the crawler's default behavior?

I have even tried try/catch statements and if/else statements, and nothing works.
code:
lifeWithoutPlasticRouter.addHandler('LIFE_WITHOUT_PLASTIC_PRODUCT', async ({ page, request }) => {
    try {

        await page.goto(page.url(), { waitUntil: 'domcontentloaded' })

        console.log('Scraping products');

        const storeName = 'Life Without Plastic';

        const title = await page.$eval('h1.product-title', (el) => el.textContent?.trim() || '');
        
        let image = await page.$eval('a.product-image', (img) => img.getAttribute('href'));

        let description = await page.$$eval('div.product-description-wrapper p', (paragraphs) => {
          return paragraphs.map((p) => p.textContent?.trim()).join(' ');
        });
      
        let salePrice = await page.$eval('span.price-value', (el) => el.textContent?.trim() || '');
        let newTag = await page.$eval('span.price-ns', (el) => el.textContent?.trim() || '');
        let originalPrice = salePrice;

        if(newTag){
          originalPrice = newTag;
        }else{
          return
        }
        originalPrice = originalPrice.replace("$", "")
        originalPrice = originalPrice.replace("USD", "")

        salePrice = salePrice.replace("$", "")
        salePrice = salePrice.replace("USD", "")

        const shippingInfo = 'Free Shipping on orders above $100';
       ...
});
Especially this part here: it doesn't avoid the error, and even a try/catch that tries the different tag, catches the failure, and just logs an error and returns doesn't work either:
        let salePrice = await page.$eval('span.price-value', (el) => el.textContent?.trim() || '');
        let newTag = await page.$eval('span.price-ns', (el) => el.textContent?.trim() || '');
        let originalPrice = salePrice;

        if(newTag){
          originalPrice = newTag;
        }else{
          return
        }
I've tried all different combinations to catch errors, but none of them avoids the built-in Crawlee error.
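A possible direction, offered as a sketch rather than a confirmed fix: `page.$eval` throws when its selector matches nothing, and an error that escapes the request handler makes Crawlee treat the whole request as failed. `page.$`, by contrast, resolves to `null` for a missing element, so an optional field can be read without throwing. The `else { return }` branch in the snippet above also exits the handler whenever `span.price-ns` is empty, before anything is saved, which may be part of the "skipping". The names `PageLike`, `ElementLike`, `textOrNull`, `cleanPrice`, and `scrapePrices` below are hypothetical helpers, and the two interfaces are stand-ins for the small part of the browser `page` API used here; in a real handler you would use the library's own `Page` type.

```typescript
// Stand-ins (assumptions) for the part of the page API this sketch relies on:
// `page.$(selector)` resolving to a handle or null, and `handle.evaluate(fn)`.
interface ElementLike {
    evaluate<T>(fn: (el: { textContent: string | null }) => T): Promise<T>;
}
interface PageLike {
    $(selector: string): Promise<ElementLike | null>;
}

// Returns the trimmed text of the first match, or null when nothing matches
// or the text is empty. Never throws just because the element is missing.
async function textOrNull(page: PageLike, selector: string): Promise<string | null> {
    const handle = await page.$(selector); // null instead of an exception
    if (!handle) return null;
    const text = await handle.evaluate((el) => el.textContent);
    const trimmed = text?.trim();
    return trimmed ? trimmed : null;
}

// Same clean-up as in the original snippet: drop "$" and "USD".
function cleanPrice(raw: string): string {
    return raw.replace('$', '').replace('USD', '').trim();
}

// Reads both price fields; falls back to the sale price when the
// `span.price-ns` tag is absent, instead of returning early.
async function scrapePrices(
    page: PageLike,
): Promise<{ salePrice: string; originalPrice: string }> {
    const saleRaw = await textOrNull(page, 'span.price-value');
    const newTagRaw = await textOrNull(page, 'span.price-ns');
    return {
        salePrice: cleanPrice(saleRaw ?? ''),
        originalPrice: cleanPrice(newTagRaw || saleRaw || ''),
    };
}
```

Inside the handler this would replace the two `$eval` price lines and the `if`/`else` block with something like `const { salePrice, originalPrice } = await scrapePrices(page);`, so a missing tag yields an empty string and the rest of the product data is still saved. Required fields such as the title can keep using `$eval` if a missing element there should still fail the request.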