wet-aqua

sec-ch-ua header includes 'HeadlessChrome' when using @sparticuz/chromium

I've been experimenting with deploying PlaywrightCrawler to AWS Lambda, and it's working well. I used @sparticuz/chromium for the Chromium executable, as per this doc: https://crawlee.dev/docs/deployment/aws-browsers
However, on examining the request headers it generates, I found that the sec-ch-ua client hint header is always as follows:

"HeadlessChrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"


I've restricted the fingerprint generation options to 'Chrome', and the User-Agent header is nicely randomized (always Chrome, but with variations).
I've also observed that the Chrome version in the two headers doesn't always match, for example:

"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36"
"sec-ch-ua": "HeadlessChrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"


Surely this mismatch, together with the 'HeadlessChrome' brand, is going to make PlaywrightCrawler significantly easier for anti-bot systems to detect?
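
To make the mismatch concrete, here is a minimal sketch of a check that compares the two headers. The names `parseSecChUa` and `checkClientHints` are hypothetical helpers written for illustration; they are not part of Crawlee or Playwright, and the parsing assumes the simple `"Brand";v="N"` list format shown in the examples here.

```typescript
// Hypothetical helpers for illustration only (not a Crawlee or Playwright API).
interface Brand {
  name: string;
  version: string;
}

interface HintReport {
  uaMajor: string | null; // Chrome major version found in the User-Agent
  brands: Brand[]; // brands parsed from sec-ch-ua
  hasHeadlessBrand: boolean; // true if any brand name contains "Headless"
  versionMismatch: boolean; // true if a Chrome/Chromium brand disagrees with the UA
}

// Parse a sec-ch-ua value such as: "Chromium";v="124", "Google Chrome";v="124"
function parseSecChUa(value: string): Brand[] {
  const brands: Brand[] = [];
  const pattern = /"([^"]*)";\s*v="([^"]*)"/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(value)) !== null) {
    brands.push({ name: match[1], version: match[2] });
  }
  return brands;
}

function checkClientHints(userAgent: string, secChUa: string): HintReport {
  const uaMatch = /Chrome\/(\d+)/.exec(userAgent);
  const uaMajor = uaMatch ? uaMatch[1] : null;
  const brands = parseSecChUa(secChUa);
  const hasHeadlessBrand = brands.some((b) => b.name.includes('Headless'));
  // Only Chrome/Chromium brands carry the browser version; the placeholder
  // brand (e.g. "Not=A?Brand") has its own unrelated version number.
  const chromeBrands = brands.filter((b) => /Chrom(e|ium)/.test(b.name));
  const versionMismatch =
    uaMajor !== null && chromeBrands.some((b) => b.version !== uaMajor);
  return { uaMajor, brands, hasHeadlessBrand, versionMismatch };
}
```

Applied to the Lambda headers above, this reports both a headless brand and a version mismatch (116 in the User-Agent against 129 in sec-ch-ua).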

Running the same code locally (without @sparticuz/chromium) looks fine:

 "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
 "sec-ch-ua": "\"Chromium\";v=\"124\", \"Google Chrome\";v=\"124\", \"Not-A.Brand\";v=\"99\""


Is there anything I can set to keep the sec-ch-ua and User-Agent headers aligned when using @sparticuz/chromium?
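
For illustration, this is roughly what "aligned" would mean: a sec-ch-ua value derived from the same Chrome major version as the User-Agent. The function `buildSecChUa` is a hypothetical helper, not a Crawlee or Playwright API, and the brand order and the placeholder brand are simply copied from the local example above (real Chrome varies both between versions), so treat it as a sketch under those assumptions.

```typescript
// Hypothetical helper: derive a sec-ch-ua value from a User-Agent string.
// ASSUMPTION: the brand order and the "Not-A.Brand";v="99" placeholder are
// copied from the local example; real Chrome changes both between versions.
function buildSecChUa(userAgent: string): string | null {
  const match = /Chrome\/(\d+)/.exec(userAgent);
  if (!match) return null; // not a Chrome User-Agent, nothing to align
  const major = match[1];
  return `"Chromium";v="${major}", "Google Chrome";v="${major}", "Not-A.Brand";v="99"`;
}

// Example: a header pair whose versions agree with each other.
const exampleUserAgent =
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36';
const alignedHeaders = {
  'User-Agent': exampleUserAgent,
  'sec-ch-ua': buildSecChUa(exampleUserAgent),
};
```

A header pair like this could in principle be passed to Playwright's `page.setExtraHTTPHeaders`, but whether that override is honored for sec-ch-ua under @sparticuz/chromium is an assumption that has not been verified here, and it would not change what the browser reports to JavaScript running in the page.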

Thanks