sacred-emerald
Apify & Crawlee, 4y ago
37 replies

Crawlee vs bot detection systems - Plugins length is not OK

I tested PlaywrightCrawler against three bot detection test pages (see [1], [2], [3] and the attached screenshots).
In all three cases the page complains about "0 plugins" or "Plugins length".

If I open these pages in the browser I use every day (Firefox on Linux, the same combination
as in my PlaywrightCrawler settings), they report "5 plugins" and the field is green.
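
To make the failing check concrete, here is a minimal sketch of what I assume these pages test: whether navigator.plugins.length is greater than zero. The helper name checkPluginsLength and the stand-in navigator objects are hypothetical; they are not taken from the test pages or from Crawlee.

```javascript
// Hypothetical helper, not part of Crawlee or of the detection pages.
// Assumption: the pages treat an empty navigator.plugins list as a bot signal.
function checkPluginsLength(navigatorLike) {
    // Treat a missing plugins property the same as an empty list.
    const count = navigatorLike.plugins ? navigatorLike.plugins.length : 0;
    return { count, ok: count > 0 };
}

// Stand-ins for the two situations described above.
const crawlerResult = checkPluginsLength({ plugins: [] });           // what the crawler seems to expose
const desktopResult = checkPluginsLength({ plugins: new Array(5) }); // my everyday Firefox

console.log(crawlerResult); // { count: 0, ok: false }
console.log(desktopResult); // { count: 5, ok: true }
```

Inside a request handler, the real value could be read with page.evaluate(() => navigator.plugins.length) and passed through the same kind of check.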

Is it something in my code?
Can Crawlee emulate these plugin attributes?

[1] - https://infosimples.github.io/detect-headless/
[2] - https://intoli.com/blog/not-possible-to-block-chrome-headless/chrome-headless-test.html
[3] - https://webscraping.pro/wp-content/uploads/2021/02/testresult2.html

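Here is the relevant part of my PlaywrightCrawler setup:
import { PlaywrightCrawler } from 'crawlee';
import { firefox } from 'playwright';

const crawler = new PlaywrightCrawler({
    ...
    browserPoolOptions: {
        // Generate and inject browser fingerprints.
        useFingerprints: true,

        fingerprintOptions: {
            // Restrict generated fingerprints to Firefox on Linux.
            fingerprintGeneratorOptions: {
                browsers: ['firefox'],
                operatingSystems: ['linux'],
            },
        },
    },

    launchContext: {
        // Launch Playwright's Firefox build.
        launcher: firefox,
    },

});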


Screenshots:
01-infosimples.github.io-19b9a46843518680ccc72bada5fe8b69.png
02-intoli.com-44d20f5d8ce2747086171e4aeecca746.png
03-webscraping.pro-f1fceabcc55af4353c0da1cddf3e72d7.png