Issues with charset
I am new to Apify. I love how useful it is, so I decided to learn it by solving a real-life problem: scraping data from an official government website.
I am using Cheerio Scraper to get data from a list (link below) that contains Czech text. My problem is that I cannot get the data in the correct encoding: characters from the Czech alphabet come out garbled.
It scrapes this: "Apoďż˝tolskďż˝ cďż˝rkev, 1. sbor Praha" (with windows-1250), or this: "Apo�tolsk� c�rkev, 1. sbor Praha" (with utf8), instead of this: "Apoštolská církev, 1. sbor Praha".
I have tried forcing different response encodings (utf8, windows-1250) and sending different headers, but without success.
After many hours I feel like I am getting nowhere. Do you have any suggestions?
Start URL:
https://www-cns.mkcr.cz/cns_internet/CNS/Seznam_cpo.aspx?id_subj=148&str_zpet=Seznam_CPO.aspx
Glob pattern:
https://www-cns.mkcr.cz/cns_internet/CNS/Detail_cpo.aspx?id_subj=*&str_zpet=Seznam_CPO.aspx
Link selector:
td > a
Code:
BTW: I am using a proxy located in CZ to reach the site.
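To make the two garbled outputs easier to reason about, here is a minimal sketch in plain Node.js (not Apify-specific) that seems to reproduce both of them. It assumes the server really sends windows-1250 bytes, which I have not confirmed. The byte values are hand-written for the sample string, and the variable names are just illustrative.

```javascript
// "Apoštolská církev" as raw windows-1250 bytes (ASSUMPTION: this is what
// the server sends). In windows-1250: š = 0x9A, á = 0xE1, í = 0xED.
const rawBytes = Uint8Array.from([
  0x41, 0x70, 0x6f, 0x9a, 0x74, 0x6f, 0x6c, 0x73, 0x6b, 0xe1, // "Apoštolská"
  0x20,                                                       // " "
  0x63, 0xed, 0x72, 0x6b, 0x65, 0x76,                         // "církev"
]);

// Case 1: the raw bytes are decoded as windows-1250. This is the text I want.
const correct = new TextDecoder('windows-1250').decode(rawBytes);

// Case 2: the raw bytes are decoded as UTF-8. 0x9A, 0xE1 and 0xED are not
// valid UTF-8 here, so each one becomes U+FFFD (the replacement character).
const asUtf8 = new TextDecoder('utf-8').decode(rawBytes);

// Case 3: the already-damaged text from case 2 is written out as UTF-8
// (U+FFFD becomes the bytes EF BF BD) and those bytes are then read as
// windows-1250, where EF BF BD are three separate characters.
const damagedBytes = new TextEncoder().encode(asUtf8);
const doubleDecoded = new TextDecoder('windows-1250').decode(damagedBytes);

console.log(correct);
console.log(asUtf8);
console.log(doubleDecoded);
```

If this is right, the three-character garbage in the windows-1250 attempt means the body had already been decoded as UTF-8 before the forced encoding was applied, so the original bytes were lost by then. The decoding would have to happen on the raw response bytes instead.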
