Jack Wenyoung, 3mo ago

How to combine the scraping results for Crawlee Playwright actor?

Hello folks! I'm building an actor with Crawlee and Playwright for BBB scraping. I have completed the "detail" content parsing, and I added two boolean inputs to let users decide whether to scrape reviews and complaints. Then I realized that if they choose both, I'll need to send two more requests, and I don't know how to use the router to combine the results (the reviews, complaints, and detail results) before calling Dataset.pushData.
2 Replies
Solution
Exp, 3mo ago
You need a way to persist state per entity (e.g., one business from BBB) across multiple requests, and only pushData() once all requested pieces (detail + reviews + complaints) are collected. In Crawlee, you do this by:
1. Storing a “partial result” in request.userData.
2. Passing it along to subsequent requests.
3. Once all needed data are gathered, pushing it to the dataset.
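A minimal sketch of those three steps, written as plain TypeScript so the decision logic can be run without a browser. The names `Carry`, `planSteps`, `nextAction`, the `REVIEWS`/`COMPLAINTS` labels, and the `<business url>/reviews` URL pattern are all assumptions made for illustration, not part of Crawlee or of BBB's real site layout. The commented wiring at the bottom shows roughly how it might plug into a router handler.

```typescript
// Follow-up pages a business may still need (hypothetical router labels).
type Step = 'REVIEWS' | 'COMPLAINTS';

// The partial result that travels with each request.
interface BusinessRecord {
    url: string;
    detail: Record<string, unknown>;
    reviews?: unknown[];
    complaints?: unknown[];
}

// Shape stored in request.userData: the record so far plus the steps still to do.
interface Carry {
    record: BusinessRecord;
    pending: Step[];
}

type Action =
    | { kind: 'push'; item: BusinessRecord }
    | { kind: 'enqueue'; request: { url: string; label: Step; userData: Carry } };

// Turn the two boolean inputs into an ordered list of follow-up steps.
function planSteps(input: { scrapeReviews: boolean; scrapeComplaints: boolean }): Step[] {
    const steps: Step[] = [];
    if (input.scrapeReviews) steps.push('REVIEWS');
    if (input.scrapeComplaints) steps.push('COMPLAINTS');
    return steps;
}

// Decide what a handler should do after it has added its piece to the record:
// push the finished record, or enqueue the next page with the record attached.
function nextAction(record: BusinessRecord, pending: Step[]): Action {
    if (pending.length === 0) return { kind: 'push', item: record };
    const label = pending[0] as Step;
    const rest = pending.slice(1);
    // Assumption: follow-up pages live at <business url>/<label in lowercase>.
    const url = `${record.url}/${label.toLowerCase()}`;
    return { kind: 'enqueue', request: { url, label, userData: { record, pending: rest } } };
}

// Hypothetical wiring inside a Crawlee router handler (not executed here;
// parseReviews is a placeholder for your own parsing code):
//
// router.addHandler('REVIEWS', async ({ request, page, addRequests }) => {
//     const { record, pending } = request.userData as Carry;
//     const reviews = await parseReviews(page);
//     const action = nextAction({ ...record, reviews }, pending);
//     if (action.kind === 'push') await Dataset.pushData(action.item);
//     else await addRequests([action.request]);
// });
```

The detail handler would start the chain with `nextAction(record, planSteps(input))`, so a business with both options off is pushed immediately and one with both on is pushed only after the complaints page. Chaining the requests one after another keeps a single owner of the record at any time, which avoids having to merge results from handlers that run in parallel.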
Jack Wenyoung (OP), 3mo ago
Oh I see, that makes sense, thank you! This solves my problem. 🫶
