like-gold•2y ago
Scraping Lazada.
Hi everyone, has anyone had success scraping the retailer Lazada (e.g. https://www.lazada.co.th/products/2511-2-5-i5008341785-s21162506684.html)?
This is just one country as an example; they are active in multiple markets (th, my, vn, sg, id).
I've had success crawling the store pages by appending
&ajax=true
to the store URL and retrying the request until it works,
e.g. https://www.lazada.co.th/junkins/?q=All-Products&from=wangpu&langFlag=th&pageTypeId=2&ajax=true
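For anyone who wants to try the same approach, here is a minimal sketch of the append-and-retry idea. The helper names (withAjaxFlag, fetchWithRetry), the retry count, the delay, and the choice to treat any HTTP 2xx response as "it works" are all assumptions made for illustration, not something confirmed for Lazada. It relies on the global fetch available in Node 18+.

```javascript
// Hypothetical helper: append ajax=true to a store URL.
function withAjaxFlag(storeUrl) {
  const url = new URL(storeUrl);
  url.searchParams.set('ajax', 'true');
  return url.toString();
}

// Hypothetical helper: retry the request until it succeeds or attempts run out.
// Assumption: a 2xx status means the request "worked".
async function fetchWithRetry(
  url,
  { maxAttempts = 5, delayMs = 2000, fetchImpl = globalThis.fetch } = {}
) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await fetchImpl(url);
      if (response.ok) return await response.text();
      lastError = new Error(`HTTP ${response.status}`);
    } catch (err) {
      lastError = err; // network error, try again
    }
    if (attempt < maxAttempts) {
      // Wait a little longer after each failed attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
  throw lastError;
}
```

Usage would look something like `fetchWithRetry(withAjaxFlag(storeUrl))`; passing a custom fetchImpl makes it possible to route the request through a proxy-aware client instead.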
However, I haven't had any success accessing the product page itself: it always gives me a captcha that cannot be solved and just asks you to refresh the page.
I've tried Puppeteer with live cookies generated by a third-party service, residential proxies, and different Puppeteer configs.
I tried to mimic a real user as much as possible, but I'm still getting blocked.
I'm using Puppeteer version 3.
Can anyone help me find a way to access their product pages with a bot?
Thanks in advance!!!
automatic-azure•2y ago
Hi @Teodor, are you blocked when you start Chrome manually on your local computer and then connect Puppeteer to it via CDP (the Chrome DevTools Protocol)?
I use this method when troubleshooting because the browser environment is very close to a non-automated one, which helps circumvent browser fingerprinting techniques.
For example you can start Chrome like this:
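A rough sketch of such a launch, written as a small Node script so it matches the Puppeteer code. The Chrome binary path, the debugging port 9222, the profile directory, and the helper names (chromeArgs, startChrome) are assumptions to adjust for your machine. The flags themselves (--remote-debugging-port, --user-data-dir, --proxy-server) are standard Chrome switches.

```javascript
// Assumed values: adjust for your own machine.
const CHROME_PATH = '/usr/bin/google-chrome';
const DEBUG_PORT = 9222;

// Hypothetical helper: build the command-line flags for a manually started Chrome.
function chromeArgs({ port = DEBUG_PORT, profileDir = '/tmp/chrome-manual-profile', proxy } = {}) {
  const args = [
    `--remote-debugging-port=${port}`, // exposes the CDP endpoint for Puppeteer
    `--user-data-dir=${profileDir}`,   // separate profile, keeps cookies between runs
  ];
  // Chrome's --proxy-server does not accept credentials, so point it at a local proxy.
  if (proxy) args.push(`--proxy-server=${proxy}`);
  return args;
}

// Hypothetical helper: start Chrome as a regular, non-headless process.
async function startChrome(options) {
  // Loaded lazily so the argument builder above stays dependency-free.
  const { spawn } = await import('child_process');
  return spawn(CHROME_PATH, chromeArgs(options), { stdio: 'ignore', detached: true });
}
```

Calling `startChrome({ proxy: 'http://127.0.0.1:8080' })` is equivalent to running the Chrome binary with those three flags from a terminal.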
I use gost as an intermediate proxy server, to handle authentication to my residential proxy provider:
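A minimal sketch of that setup, assuming gost v2 is installed and on the PATH. The local port 8080, the placeholder upstream URL, and the helper names (gostArgs, startGost) are illustrative only. The -L (listen) and -F (forward) flags are the ones described in the gost README.

```javascript
// Hypothetical helper: build gost arguments for a local, unauthenticated proxy
// that forwards to an authenticated upstream (residential) proxy.
function gostArgs({ listenPort = 8080, upstream } = {}) {
  if (!upstream) throw new Error('upstream proxy URL is required');
  return [
    `-L=:${listenPort}`, // local listener that Chrome connects to without credentials
    `-F=${upstream}`,    // upstream proxy, credentials embedded in the URL
  ];
}

// Hypothetical helper: run gost as a child process.
async function startGost(options) {
  // Loaded lazily so the argument builder above stays dependency-free.
  const { spawn } = await import('child_process');
  return spawn('gost', gostArgs(options), { stdio: 'inherit' });
}
```

With a placeholder such as `startGost({ upstream: 'http://USERNAME:PASSWORD@proxy.example.com:8000' })`, Chrome can then be pointed at http://127.0.0.1:8080 and never has to answer a proxy authentication prompt.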
Then in your code you can start Puppeteer like this:
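A minimal sketch of the connection step, assuming Chrome was started with --remote-debugging-port=9222 on the same machine. The helper names (connectOptions, openInManualChrome) are hypothetical; puppeteer.connect with the browserURL option is the regular Puppeteer API for attaching to an already running browser.

```javascript
// Hypothetical helper: connection options for a locally running Chrome.
function connectOptions(port = 9222) {
  return {
    browserURL: `http://127.0.0.1:${port}`, // CDP endpoint exposed by Chrome
    defaultViewport: null,                  // keep the real window size
  };
}

// Hypothetical helper: open a page in the manually started Chrome and return its title.
async function openInManualChrome(targetUrl, port = 9222) {
  // Loaded lazily so connectOptions can be used without Puppeteer installed.
  const { default: puppeteer } = await import('puppeteer');
  const browser = await puppeteer.connect(connectOptions(port));
  try {
    const page = await browser.newPage();
    await page.goto(targetUrl, { waitUntil: 'domcontentloaded' });
    return await page.title();
  } finally {
    // Detach without closing Chrome, so the tab stays open for inspection.
    browser.disconnect();
  }
}
```

Because the script only attaches to the browser, you can compare what the bot sees with what you see when browsing the same product page by hand in that window.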
GitHub
gost/README_en.md at master · ginuerzh/gost
GO Simple Tunnel - a simple tunnel written in golang - ginuerzh/gost