Apify Discord Mirror

Updated 5 months ago

Error handling / best practices

At a glance

The community member uses pre-built actors in their application to build a dataset and wants to improve the error handling of this approach. They ask which errors they could encounter, for example the actor breaking (memory, CPU, or another issue), an exception being raised, or errors in the dataset itself (400 status code, crawler blocked, etc.), and what the best practices are for handling these situations.

In the comments, another community member congratulates the original poster on advancing to level 1.

Hello,
I am using pre-built actors in my application. I use them like this to create the dataset:
    client = ApifyClientAsync(token=settings.APIFY_API_TOKEN)
    run = await client.actor(actor.value).start(run_input=run_input)
    dataset_client = client.dataset(run["defaultDatasetId"])
    processed = 0
    while True:
        await asyncio.sleep(2)
        # Read the status before fetching, so the final pass still collects
        # items written just before the run ended.
        run_status = await client.run(run["id"]).get()
        status = (run_status or {}).get("status")
        async for item in dataset_client.iterate_items(offset=processed):
            dataset.append(item)
            processed += 1
            logger.info(f"processing item: {item.get('url')}")
        # These statuses all mean the run has not ended yet.
        if status in ("READY", "RUNNING", "TIMING-OUT", "ABORTING"):
            logger.info("Run is still running")
            continue
        logger.info(f"Run is finished with status {status}.")
        break

I want to improve the error handling of this approach. I am wondering which types of errors or issues I could encounter and what the best practices are. For example: what happens if the actor breaks (memory, CPU, or another issue), or if I get an exception (and of which types)? What if there are errors in the dataset (400 status code, crawler blocked, etc.)? Does anyone have recommendations here? Thank you!!
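One possible starting point for the "actor breaks" case is to stop treating "not RUNNING" as "finished fine" and to check the run's final status explicitly. The sketch below is only an illustration, not an official Apify recipe: `ActorRunError`, `run_succeeded`, and `wait_for_run` are hypothetical names, and the run record is assumed to be a dict with a `status` key, as in the snippet above. It does not import the Apify client; the caller passes in a coroutine function that fetches the run.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Optional

# Apify run statuses that mean the run has not ended yet.
ACTIVE_STATUSES = {"READY", "RUNNING", "TIMING-OUT", "ABORTING"}


class ActorRunError(RuntimeError):
    """Hypothetical exception: the run ended, or vanished, without succeeding."""


def run_succeeded(run: Optional[Dict[str, Any]]) -> bool:
    """Return True if the run succeeded, False if it is still active.

    Raises ActorRunError for FAILED, ABORTED, TIMED-OUT, unknown statuses,
    or a run record that could not be found (None).
    """
    if run is None:
        raise ActorRunError("run record not found")
    status = run.get("status")
    if status == "SUCCEEDED":
        return True
    if status in ACTIVE_STATUSES:
        return False
    raise ActorRunError(f"run ended with status {status!r}")


async def wait_for_run(
    get_run: Callable[[], Awaitable[Optional[Dict[str, Any]]]],
    poll_interval: float = 2.0,
    max_wait: float = 600.0,
) -> None:
    """Poll get_run() until the run succeeds; raise on failure or timeout."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + max_wait
    while not run_succeeded(await get_run()):
        if loop.time() >= deadline:
            raise TimeoutError(f"run still active after {max_wait} seconds")
        await asyncio.sleep(poll_interval)
```

With the client from the snippet above, this could be called as `await wait_for_run(lambda: client.run(run["id"]).get())`. Exceptions raised by the client itself (for example on network problems) would propagate out of `get_run`, so the caller can decide whether to retry. Item-level problems such as blocked pages depend on how each actor reports them and are not covered by this sketch.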
1 comment
just advanced to level 1! Thanks for your contributions! 🎉