deep-jade
deep-jade•3y ago

Bypassing cookies consent

Hello everyone. I want to scrape data from Google Maps using Crawlee. However, after scraping the content of a certain tag, I realized that the content comes from the cookie consent page that Google Maps shows first. As some of you may know, if you visit Google Maps for the first time, you land on a separate page asking you to accept cookies, and only after you click "Accept all" are you forwarded to Maps itself. How can I go straight to Google Maps and start scraping, either by bypassing the consent page or by accepting it automatically?
12 Replies
Alexey Udovydchenko
Alexey Udovydchenko•3y ago
Do not use cookies or sessions. When you hit the consent page, just retry (assuming you are using proxies, so the retry is made from a new IP). You should be able to access the data within about 5 retries at most.
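The retry advice above can be sketched without any library. This is a minimal sketch, not a definitive implementation: `isConsentUrl`, `loadPastConsent` and the `Loader` type are hypothetical names, and the `consent.google.com` prefix is an assumption about where Google sends first-time visitors. In Crawlee itself, the same effect should be achievable by throwing an error from `requestHandler` when the consent page is detected, since failed requests are retried up to `maxRequestRetries` times.

```typescript
// Assumption: the consent page is served from this host.
const CONSENT_URL_PREFIX = 'https://consent.google.com/';

/** True when the final URL (after redirects) is the cookie consent page. */
function isConsentUrl(finalUrl: string): boolean {
    return finalUrl.startsWith(CONSENT_URL_PREFIX);
}

/** A loader fetches a URL and reports where it ended up and what it got. */
type Loader = (url: string, attempt: number) => Promise<{ finalUrl: string; body: string }>;

/**
 * Calls `load` until it returns something other than the consent page,
 * allowing up to `maxRetries` retries after the first attempt.
 * The loader is expected to use a fresh IP and no stored cookies each time.
 */
async function loadPastConsent(url: string, load: Loader, maxRetries = 5): Promise<string> {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
        const { finalUrl, body } = await load(url, attempt);
        if (!isConsentUrl(finalUrl)) return body;
    }
    throw new Error(`Still on the consent page after ${maxRetries} retries: ${url}`);
}
```

The loader is passed in as a parameter so that the retry logic stays independent of how pages are actually fetched (plain HTTP client, Playwright, or a Crawlee crawler).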
deep-jade
deep-jadeOP•3y ago
I'm sorry, I don't really understand. I am a beginner when it comes to scraping, is it possible to explain it like I'm 5?
quickest-silver
quickest-silver•3y ago
Well, I'm not Alexey, but I can try to explain. So, step by step... 1. Are you using a pool of "rotating" proxies? Rotating means that every time you make an HTTP request, the target website (Google Maps or any other website) sees the request coming from a DIFFERENT IP. Which rotating proxies are you using? What is the name of the service?
deep-jade
deep-jadeOP•3y ago
Thanks for the explanation. I'm not using proxies. But the Crawlee documentation says somewhere that I can configure proxies, right?
quickest-silver
quickest-silver•3y ago
I'm not using proxies.
This is mistake number 1.
quickest-silver
quickest-silver•3y ago
Go and read about using (rotating) proxies for scraping/crawling. You can start here https://developers.apify.com/academy/anti-scraping/mitigation/proxies#understanding-proxy-links
Apify
Proxies · Apify Developers
Learn all about proxies, how they work, and how they can be leveraged in a scraper to avoid blocking and other anti-scraping tactics.
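For the Crawlee side of the question, proxies are passed to a crawler through a `ProxyConfiguration` object. The following is a minimal sketch, assuming the `crawlee` package with Playwright installed; the proxy URLs are placeholders, not real endpoints, and must be replaced with the ones from whichever provider you choose.

```typescript
import { PlaywrightCrawler, ProxyConfiguration } from 'crawlee';

// Placeholder proxy URLs (assumptions): use the ones from your provider.
const proxyConfiguration = new ProxyConfiguration({
    proxyUrls: [
        'http://username:password@proxy-1.example.com:8000',
        'http://username:password@proxy-2.example.com:8000',
    ],
});

const crawler = new PlaywrightCrawler({
    // Requests made by this crawler go through the proxies listed above.
    proxyConfiguration,
    async requestHandler({ page, request, log }) {
        log.info(`Loaded ${request.url}: ${await page.title()}`);
    },
});

await crawler.run(['https://www.google.com/maps']);
```

Whether each retry really arrives from a different IP depends on the proxy service and on how browser instances are reused, so it is worth logging the proxy in use while testing.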
quickest-silver
quickest-silver•3y ago
Using (rotating) proxies almost always means you PAY somebody to provide them. We discuss such services here: https://discord.com/channels/801163717915574323/1060179502392684594 and I am using smartproxy.com. So, pick something...
xenial-black
xenial-black•3y ago
@1chbinamin See https://blog.apify.com/step-by-step-guide-to-scraping-google-maps/ As Apify points out: "One slight caveat is that it's preferable to scrape such a huge website as Google Maps by using proxies, that way, it's faster and more efficient"
Apify
How to scrape Google Maps
Extract data without limits with this unofficial Google Maps API.
Lukas Krivka
Lukas Krivka•3y ago
The consent screen on Google Maps always appears for EU proxies, so just use US-only proxies. Otherwise, you need to dynamically click through the consent screen. The scraper in Apify Store handles all of this: https://apify.com/compass/crawler-google-places
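Clicking through the consent screen could look roughly like the sketch below. It is written against a small structural type, `PageLike`, that covers only the two Playwright `Page` methods it uses, so a real Playwright page can be passed in. The helper name `acceptConsentIfShown`, the `consent.google.com` prefix and the button selector are all assumptions: Google localizes the label and changes its markup, so check the selector against the live page.

```typescript
// Minimal structural type matching the Playwright Page methods used here.
interface PageLike {
    url(): string;
    click(selector: string): Promise<void>;
}

// Assumptions: consent host and "Accept all" selector (English UI only).
const CONSENT_PREFIX = 'https://consent.google.com/';
const ACCEPT_SELECTOR = 'button[aria-label="Accept all"]';

/**
 * Clicks "Accept all" when the page is the consent screen.
 * Returns true if a click was made, false if the page was not the consent screen.
 */
async function acceptConsentIfShown(page: PageLike): Promise<boolean> {
    if (!page.url().startsWith(CONSENT_PREFIX)) return false;
    await page.click(ACCEPT_SELECTOR);
    return true;
}
```

Inside a Crawlee `requestHandler`, this would be called with the handler's `page` before extracting any data. After a click, the caller still needs to wait for the redirect back to Google Maps to finish.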
deep-jade
deep-jadeOP•3y ago
Thank you everyone. I understand what I have to do now.
metropolitan-bronze
metropolitan-bronze•3y ago
Did you ever solve this, @1chbinamin?
