like-gold•11mo ago
Scrape JSON and HTML responses in different handlers
I do not know how to scrape a website that returns both JSON and HTML responses.
My scraper needs to:
1. Send a request and parse a JSON response that contains a list of URLs, which I will enqueue.
2. Scrape those URLs as HTML, using cheerio or whatever is required to do so.
2 Replies
Hey,
For your task, I'd use 2 request handlers:
- The JSON handler will handle the JSON response: it'll parse it and enqueue the HTML requests.
- The HTML handler will parse the HTML response as usual with cheerio's $.
JSON and HTML are request labels; you can read more about labels here. Basically, if you label a request with e.g. the HTML label, it will be handled by the HTML request handler.
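Roughly, the two handlers could look like the sketch below. It's a minimal sketch, not a definitive implementation: the start URL and the { urls: [...] } shape of the JSON response are assumptions (adjust them to what your endpoint really returns), and toHtmlRequests is just a hypothetical helper name. It uses Crawlee's CheerioCrawler with a router, plus additionalMimeTypes so the crawler accepts the JSON response.

```javascript
// Hypothetical helper: turn the parsed JSON payload into requests labeled 'HTML'.
// ASSUMPTION: the payload looks like { urls: ['https://...', ...] }.
function toHtmlRequests(payload) {
    const urls = Array.isArray(payload?.urls) ? payload.urls : [];
    return urls
        .filter((url) => typeof url === 'string' && url.length > 0)
        .map((url) => ({ url, label: 'HTML' }));
}

async function main() {
    // Loaded lazily so the helper above can be reused without Crawlee installed.
    const { CheerioCrawler, createCheerioRouter, Dataset } = await import('crawlee');

    const router = createCheerioRouter();

    // 'JSON' handler: read the parsed body and enqueue the HTML pages.
    router.addHandler('JSON', async ({ json, crawler, log }) => {
        const requests = toHtmlRequests(json);
        log.info(`Enqueueing ${requests.length} HTML pages`);
        await crawler.addRequests(requests);
    });

    // 'HTML' handler: parse the page as usual with cheerio's $.
    router.addHandler('HTML', async ({ $, request }) => {
        await Dataset.pushData({
            url: request.url,
            title: $('title').text().trim(),
        });
    });

    const crawler = new CheerioCrawler({
        requestHandler: router,
        // Without this, CheerioCrawler rejects non-HTML/XML content types.
        additionalMimeTypes: ['application/json'],
    });

    // ASSUMPTION: hypothetical start URL that returns the JSON list.
    await crawler.run([{ url: 'https://example.com/api/items', label: 'JSON' }]);
}
```

Call main() to start the crawl. Every request carries a label here, so no default handler is registered; if you also enqueue unlabeled requests, add one with router.addDefaultHandler.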
Let me know if you have any questions.
Crawling the Store | Crawlee