Apify Discord Mirror

Updated 2 years ago

Storage of data or returning of results

At a glance

A community member asks whether results can be returned to a promise or callback when using the Crawlee SDK with PlaywrightCrawler, or whether data can only be written to Datasets and retrieved later. Replies indicate that data can be pushed to a Dataset or set in a KV store, and that temporary results can be added to userData in the request object. Another community member suggests the useState feature of BasicCrawler for storing global state. One reply notes that Crawlee doesn't force the default data storage method: the push function can be overridden, or a custom callback written. The original poster also found a solution by referring to the Crawlee documentation on upgrading to version 3. Finally, some community members discuss Datasets and how to loop through the results later in the code.

Hello, this shouldn't take long.

Am I reading this correctly (I have also tested it): returning results to a promise or a callback isn't an option with this SDK (Crawlee with new PlaywrightCrawler(), for example)?

Can we only write to Datasets and retrieve the data later for use?
9 comments
Yes, either push data to a dataset or set a value in the KV store (directly or via state management); everything else is outside the SDK. Temporary results can be added to userData in the request object.
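A minimal sketch of combining the two ideas in this reply: a temporary value carried in userData is merged with what the current page yields and then stored through pushData. The HandlerContext interface is a stand-in for the slice of Crawlee's handler context used here (an assumption, so the snippet runs without Crawlee or a browser), and handleDetail, category and title are hypothetical names.

```typescript
// Stand-in for the part of the handler context this sketch relies on
// (an assumption; the real context passed to requestHandler has more fields).
interface HandlerContext {
    request: { url: string; userData: Record<string, unknown> };
    pushData: (item: Record<string, unknown>) => Promise<void>;
}

// Hypothetical handler for a detail page: reads a temporary value that an
// earlier request stored in userData, merges it with this page's title,
// and stores the combined record through pushData.
async function handleDetail(ctx: HandlerContext, title: string): Promise<void> {
    const { category } = ctx.request.userData; // temp result from an earlier page
    await ctx.pushData({ url: ctx.request.url, category, title });
}
```

In a real crawler, logic like this would sit inside the requestHandler, with userData filled in when the request was enqueued.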
Thanks for your response! I’ll explore temp results to request object as well.
If you're trying to store global state that's accessible to the entire run, I recommend using useState:

https://crawlee.dev/api/basic-crawler/class/BasicCrawler#useState
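A rough sketch of how useState could be used to accumulate results across requests. It assumes only that useState takes a default value and resolves to a shared state object, as the linked reference describes. StateProvider is a stand-in interface so the snippet runs on its own, and CrawlState and recordResult are hypothetical names.

```typescript
// Stand-in for the single crawler method this sketch needs (an assumption
// about shape: a default value in, a promise of the shared state out).
interface StateProvider<S> {
    useState(defaultValue: S): Promise<S>;
}

// Hypothetical shape of the run-wide state.
interface CrawlState {
    results: { url: string; title: string }[];
}

// Hypothetical helper: appends one result to the state shared by the run.
async function recordResult(
    crawler: StateProvider<CrawlState>,
    url: string,
    title: string,
): Promise<void> {
    const state = await crawler.useState({ results: [] });
    // Mutate in place: every caller receives the same object.
    state.results.push({ url, title });
}
```

After the run finishes, calling useState once more and reading results would give the array collected so far.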
yes, thanks for this!
Crawlee doesn't force you to store data the default way. You can override the push function or write your own callback.
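One way to read "write your own callback" is to have the crawl report each item to a callback and wrap the whole run in a promise that resolves with everything collected. The sketch below shows only that wrapping. CrawlRoutine and crawlAndCollect are hypothetical names, and the Crawlee-specific part is left to the routine that gets passed in.

```typescript
// A crawl routine that reports each scraped item to a callback.
// With Crawlee, the routine's body would build a PlaywrightCrawler whose
// requestHandler calls onResult(...), then await crawler.run(urls).
type CrawlRoutine<T> = (
    urls: string[],
    onResult: (item: T) => void,
) => Promise<void>;

// Runs the routine and resolves with everything it reported, so the caller
// gets results from a promise instead of reading a Dataset afterwards.
async function crawlAndCollect<T>(
    crawl: CrawlRoutine<T>,
    urls: string[],
): Promise<T[]> {
    const results: T[] = [];
    await crawl(urls, (item) => results.push(item));
    return results;
}
```

Results collected this way live only in memory and are not persisted, which is fine for small runs but gives up the storage that Datasets provide.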
I don't understand the datasets. Why do you store the results as separate files?
How do you loop through the whole results later?
I want to store an array of objects and loop through it later in the code.
You can load the dataset items and then loop through them and store as you wish.
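A small sketch of that suggestion, assuming a dataset object whose getData() resolves to an object with an items array (Crawlee's Dataset offers a method of this shape). DatasetLike is a stand-in so the snippet runs on its own, and Product and totalPrice are hypothetical.

```typescript
// Minimal view of a dataset: getData() resolves to { items: [...] }.
// Stand-in interface (an assumption), not Crawlee's own type.
interface DatasetLike<T> {
    getData(): Promise<{ items: T[] }>;
}

// Hypothetical item shape, for illustration only.
interface Product {
    url: string;
    price: number;
}

// Loads all stored items into one array and loops over it.
async function totalPrice(dataset: DatasetLike<Product>): Promise<number> {
    const { items } = await dataset.getData();
    let total = 0;
    for (const item of items) {
        total += item.price;
    }
    return total;
}
```

With Crawlee, the dataset argument would presumably come from Dataset.open() after the crawl has finished; although items are stored as separate files on disk, getData() hands them back as a single array.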