How can I pass data extracted in the first part of the scraper to items that will be extracted later?
Hi. I'm extracting product prices. The main page gives me all the information I need except for the fees. If I visit every product page individually, I can get both the price and the fees, but sometimes I lose the fee information because I get blocked on some products. I want to handle this situation: if I manage to extract the fees, I want to add them to my product_item, but if I get blocked, I want to store the fees as empty. I'm using the "Router" class as the Crawlee team explains here: https://crawlee.dev/python/docs/introduction/refactoring. When I enqueue the URL extracted from the first page as shown below, I cannot pass along the data I extracted earlier:
    await context.enqueue_links(url='product_url', label='PRODUCT_WITH_FEES')

I want something like this:

    await context.enqueue_links(url='product_url', label='PRODUCT_WITH_FEES', data=product_item)  # product_item is a dict

But I cannot do the above. How can I do it?

My final data should look as follows. If I handle the data correctly, I want something like this:

    product_item = {'product_id': 1234, 'price': '50$', 'fees': '3$'}

If I get blocked, I want something like this:

    product_item = {'product_id': 1234, 'price': '50$', 'fees': ''}
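
To make the desired behaviour concrete, here is a minimal sketch in plain Python that makes no Crawlee calls. The helper name `finalize_product_item` and the use of `None` to signal a blocked product page are assumptions made for illustration only. How the dict travels between handlers is a separate matter: Crawlee's `Request` objects appear to carry a `user_data` field intended for this, but that should be checked against the Crawlee documentation rather than taken from this sketch.

```python
from typing import Optional


def finalize_product_item(product_item: dict, fees: Optional[str]) -> dict:
    """Return a copy of product_item with 'fees' set, or '' if fees are unavailable.

    Hypothetical helper: `fees` is None when the product page was blocked.
    """
    final_item = dict(product_item)  # copy so the main-page data is not mutated
    final_item["fees"] = fees if fees is not None else ""
    return final_item


# Data scraped on the main page (values taken from the example above).
listing_item = {"product_id": 1234, "price": "50$"}

# Product page scraped successfully: the fees are known.
ok_item = finalize_product_item(listing_item, "3$")

# Product page blocked: no fees were extracted, so the field stays empty.
blocked_item = finalize_product_item(listing_item, None)
```

The same helper could be called from both the product-page handler (with the extracted fees) and whatever code runs when the request ultimately fails (with `None`), so every product ends up in the dataset exactly once with a consistent shape.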