Confusion around configuring Crawlee to use a Tor proxy

Here is the code I'm working with currently:
import { PlaywrightCrawler, ProxyConfiguration } from 'crawlee';
import { firefox } from 'playwright';


const startUrls = ['https://crawlee.dev'];
const BBCNewsOnionStartUrls = ['https://www.bbcnewsd73hkzno2ini43t4gblxvycyac5aw4gnv7t2rccijh7745uqd.onion/'];
const proxyConfiguration = new ProxyConfiguration({ proxyUrls: ['socks5://localhost:9050'] });

const crawler = new PlaywrightCrawler({
    launchContext: {
        launcher: firefox,
        launchOptions: {
            proxy: {
                server: 'socks5://localhost:9050'
            },
            headless: false,
        }
    },
    proxyConfiguration: proxyConfiguration,
    requestHandler: async ({ request, page, log }) => {
        const pageTitle = await page.title();
        log.info(`URL: ${request.loadedUrl} | Page title: ${pageTitle}`);
    },
    // Comment out this option to scrape the full website.
    maxRequestsPerCrawl: 20,
    maxRequestsPerMinute: 10,
    maxConcurrency: 1,
    minConcurrency: 1,
    sameDomainDelaySecs: 1,
});

// await crawler.run(startUrls);
await crawler.run(BBCNewsOnionStartUrls);

When I pass proxyConfiguration to the crawler, I run into the following error:
ERROR PlaywrightCrawler: Request failed and reached maximum retries. page.goto: NS_ERROR_UNKNOWN_PROXY_HOST
However, when I remove it, everything seems to work okay.

So my question is: why isn't proxyConfiguration needed in this case? Are all requests still being routed through the Tor proxy I have running locally?

Thanks!

I have verified that the Tor proxy is running via:
curl --socks5-hostname localhost:9050 https://check.torproject.org/api/ip
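
The curl check only shows that the Tor daemon itself is reachable; it says nothing about the browser that the crawler launches. A rough sketch of running the same check from inside a Playwright page is below. The helper names (parseTorCheck, checkTorFromPage) are made up for illustration, and the sketch assumes the endpoint returns JSON with an IsTor boolean and an IP string, which is what I see from curl but have not found documented.

```javascript
// Same endpoint as in the curl command above.
const TOR_CHECK_URL = 'https://check.torproject.org/api/ip';

// Hypothetical helper: parse the JSON body returned by the check endpoint.
// Assumes a shape like {"IsTor": true, "IP": "..."}.
function parseTorCheck(body) {
    const data = JSON.parse(body);
    return {
        isTor: data.IsTor === true,
        ip: typeof data.IP === 'string' ? data.IP : '',
    };
}

// Hypothetical helper: load the check endpoint in an existing Playwright page
// and report whether the browser's traffic exits through Tor.
async function checkTorFromPage(page) {
    const response = await page.goto(TOR_CHECK_URL);
    if (!response) {
        throw new Error(`No response received from ${TOR_CHECK_URL}`);
    }
    return parseTorCheck(await response.text());
}
```

Calling checkTorFromPage(page) from the requestHandler and logging the result should show whether isTor is true with only launchOptions.proxy set. Note that it navigates the page away from the URL being crawled, so it is only meant as a one-off diagnostic.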