conscious-sapphire•3y ago

Using Apify asynchronously

Hi there, I've been using Apify for quite a while now - great product! Yet to get the dataset items of a scraped shop, we seem to have only 2 options: "Run actor synchronously and get dataset items" or "Get last run dataset items". This becomes problematic because long synchronous requests can get interrupted, and it looks as if it's only possible to retrieve the last run's results, which means we can work on just 1 website at a time (no parallelism). Any workaround? A good option would be to send a request to scrape website X, get the run ID Y, and then just "get run ID (Y) dataset items". I also checked the "Integrations" - "HTTP Webhook" option, but judging from the available variables it doesn't give access to the dataset items themselves (that would be the best option). Any ideas? Thanks, Arseni
3 Replies
Alexey Udovydchenko•3y ago
It's actually a common approach:
// Start the Actor and wait for it to finish; the returned run object carries the ID of its own dataset.
const actorRun = await Actor.call(name, input, runOptions);
// Open that specific run's dataset (forceCloud reads it from the Apify platform even when running locally).
const actorDataset = await Actor.openDataset(actorRun.defaultDatasetId, { forceCloud: true });
If you want to do it via API calls instead of the SDK, click the "API" button at the top right of the Actor's web page; you will see the endpoints and a link to the API manual.
Lukas Krivka•3y ago
The async flow is explained fully in this article - https://docs.apify.com/tutorials/run-actor-and-retrieve-data-via-api
Apify
Run actor and retrieve data via API · Apify Documentation
Learn how to run an actor/task via the Apify API, wait for the job to finish, and retrieve its output data. Your key to integrating actors with your projects.
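For readers who want the shape of that flow at a glance, here is a minimal sketch (not taken from the article): start a run, poll it until it finishes, then read the items from that run's own dataset, so several runs can proceed in parallel. It assumes Node 18+ for the global `fetch`. The function names, the `pollMs` interval, and the injectable `fetchImpl` parameter are illustrative choices, not part of Apify's API.

```javascript
// Sketch of the async flow over the Apify REST API (assumes Node 18+ for global fetch).
// Function names, pollMs, and the injectable fetchImpl are illustrative assumptions.
const API_BASE = 'https://api.apify.com/v2';
const TERMINAL_STATUSES = new Set(['SUCCEEDED', 'FAILED', 'ABORTED', 'TIMED-OUT']);

// Send an authenticated request and return the parsed JSON body.
async function apiRequest(url, token, fetchImpl, init = {}) {
  const res = await fetchImpl(url, {
    ...init,
    headers: { ...(init.headers || {}), Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`Request to ${url} failed: HTTP ${res.status}`);
  return res.json();
}

// Start a run without waiting for it; resolves to the run object
// (including its id and defaultDatasetId).
async function startRun(actorId, input, token, fetchImpl) {
  const body = await apiRequest(`${API_BASE}/acts/${encodeURIComponent(actorId)}/runs`, token, fetchImpl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(input),
  });
  return body.data;
}

// Poll the run until it reaches a terminal status.
async function waitForRun(runId, token, fetchImpl, pollMs) {
  for (;;) {
    const { data: run } = await apiRequest(`${API_BASE}/actor-runs/${runId}`, token, fetchImpl);
    if (TERMINAL_STATUSES.has(run.status)) return run;
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}

// Full flow: start, wait, then fetch the items of this run's dataset only.
async function runAndGetItems(actorId, input, token, options = {}) {
  const { fetchImpl = fetch, pollMs = 5000 } = options;
  const started = await startRun(actorId, input, token, fetchImpl);
  const finished = await waitForRun(started.id, token, fetchImpl, pollMs);
  if (finished.status !== 'SUCCEEDED') {
    throw new Error(`Run ${finished.id} ended with status ${finished.status}`);
  }
  return apiRequest(`${API_BASE}/datasets/${finished.defaultDatasetId}/items?format=json`, token, fetchImpl);
}
```

Because each call tracks its own run ID and dataset ID, several websites can be scraped at once, for example with `Promise.all(inputs.map((input) => runAndGetItems(actorId, input, token)))`.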
conscious-sapphireOP•3y ago
Thank you very much! I would really suggest adding the dataset items endpoint ('https://api.apify.com/v2/datasets/<defaultDatasetId>/items') to the menu that opens when clicking the API button at the top right. It's really confusing when it doesn't show up there. Solved 🙂
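For reference, a small sketch of how that endpoint URL might be assembled for a given run's `defaultDatasetId`. The helper name `datasetItemsUrl` is hypothetical; `format`, `limit`, and `offset` are query parameters of the dataset items endpoint, and authentication is left out here.

```javascript
// Hypothetical helper: build the dataset items URL for one specific run's dataset.
// Query parameters such as format, limit, and offset are passed through as given.
function datasetItemsUrl(datasetId, params = {}) {
  const url = new URL(`https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`);
  for (const [key, value] of Object.entries(params)) {
    // Skip parameters the caller left undefined.
    if (value !== undefined) url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Example: second page of 100 items as JSON ('abc123' is a placeholder dataset ID).
console.log(datasetItemsUrl('abc123', { format: 'json', limit: 100, offset: 100 }));
```

Keeping the dataset ID as an explicit argument is what makes parallel runs workable: each run's results are addressed directly instead of through "last run".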
