quickest-silver (OP), 3y ago

Parse RSS XML

I'm trying to parse an RSS feed. I added it to additionalMimeTypes, but then the crawler logs this error: ERROR CheerioCrawler: Request failed and reached maximum retries. ReferenceError: $ is not defined. Any advice on how to do this correctly, please? Thanks a lot!
5 Replies
HonzaS, 3y ago
Can you share the RSS feed URL? You can try changing the content-type header of the response in postNavigationHooks.
quickest-silver (OP), 3y ago
Thanks Honza, it's here: https://mladypodnikatel.cz/feed Now I'm stuck in the way described in https://github.com/apify/crawlee/issues/271, although that bug is supposedly solved.
HonzaS, 3y ago
So add this to your crawler constructor:
postNavigationHooks: [({ response }) => {
    response.headers['content-type'] = 'application/xhtml+xml';
}],
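For anyone landing here later, here is a minimal sketch of that hook as a standalone function, so it can be tried without running a crawl. The name forceXhtmlContentType, the fakeContext object, and the application/rss+xml starting value are assumptions made up for illustration; only the header assignment comes from the thread. The idea, as far as I understand it, is that CheerioCrawler only builds $ for content types it treats as HTML/XML, so relabeling the response makes it parse the feed.

```javascript
// Hook in the shape used by postNavigationHooks: it receives the crawling
// context and mutates the response headers before the body is parsed.
// (forceXhtmlContentType is a hypothetical name, not part of Crawlee.)
function forceXhtmlContentType({ response }) {
    response.headers['content-type'] = 'application/xhtml+xml';
}

// Stand-in for a crawling context so the hook can be exercised offline.
// Assumption: the feed is served with an RSS-specific content type.
const fakeContext = {
    response: {
        headers: { 'content-type': 'application/rss+xml; charset=UTF-8' },
    },
};

forceXhtmlContentType(fakeContext);
console.log(fakeContext.response.headers['content-type']);

// In a real crawler (needs the `crawlee` package), the same function
// would be passed in the constructor options, roughly like this:
// const crawler = new CheerioCrawler({
//     postNavigationHooks: [forceXhtmlContentType],
//     requestHandler: async ({ $ }) => { /* read feed items via $ */ },
// });
```

Keeping the hook as a named function rather than an inline arrow makes it easy to reuse across crawlers and to check in isolation.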
quickest-silver (OP), 3y ago
Perfect, it works 🥳 Thank you
genetic-orange, 13mo ago
@HonzaS saving my ass again ❤️
