conscious-sapphire•14mo ago
Scrape JSON and HTML responses in different handlers
I don't know how to scrape a website that returns both JSON and HTML responses.
My scraper needs to:
1. Send a request and parse a JSON response that contains a list of URLs, which I will then enqueue.
2. Scrape those URLs as HTML, using cheerio or whatever is required to do so.
2 Replies
Apify Community
Hey,
For your task, I'd use 2 request handlers:
- JSON handler will handle the JSON response: it'll parse it and enqueue the HTML requests
- HTML handler will parse the HTML response as usual with cheerio's $
JSON and HTML are request labels; you can read more about labels here. Basically, if you label a request with e.g. the HTML label, it will be handled by the HTML request handler.
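Here's a rough sketch of how the two handlers could look with CheerioCrawler and a router. Treat it as a starting point under a few assumptions, not a drop-in solution: the start URL `https://example.com/api/items`, the `ListingResponse` shape (a `urls` array) and the `toHtmlRequests` helper are made up for illustration, so adjust them to whatever your JSON endpoint really returns. `additionalMimeTypes` is set so that the crawler accepts the JSON response.

```typescript
// Hypothetical shape of the JSON response; replace with your real payload.
interface ListingResponse {
    urls: string[];
}

// Hypothetical helper: turn the JSON payload into requests labeled 'HTML',
// so they get routed to the HTML handler below.
function toHtmlRequests(payload: ListingResponse): { url: string; label: string }[] {
    return payload.urls.map((url) => ({ url, label: 'HTML' }));
}

async function main(): Promise<void> {
    // Loaded here so the helper above can be reused without starting a crawl.
    const { CheerioCrawler, createCheerioRouter } = await import('crawlee');

    const router = createCheerioRouter();

    // Handles requests labeled 'JSON': read the parsed body and enqueue HTML pages.
    router.addHandler('JSON', async ({ json, crawler, log }) => {
        const requests = toHtmlRequests(json as ListingResponse);
        log.info(`Enqueueing ${requests.length} HTML pages`);
        await crawler.addRequests(requests);
    });

    // Handles requests labeled 'HTML': parse the page with cheerio's $.
    router.addHandler('HTML', async ({ $, request, pushData }) => {
        await pushData({ url: request.url, title: $('title').text().trim() });
    });

    const crawler = new CheerioCrawler({
        requestHandler: router,
        // Allow JSON responses in addition to the default HTML content types.
        additionalMimeTypes: ['application/json'],
    });

    // The start request carries the 'JSON' label (placeholder URL).
    await crawler.run([{ url: 'https://example.com/api/items', label: 'JSON' }]);
}
```

Calling `main()` starts the crawl. The start request goes to the JSON handler because of its label, and every URL it enqueues goes to the HTML handler.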
Let me know if you have any questions.
Crawling the Store | Crawlee · Build reliable crawlers. Fast.