Apify Discord Mirror

Updated 2 years ago

Unable to run crawlee in aws lambda (Protocol error (Target.setAutoAttach): Target closed)

At a glance

A community member is trying to run the Crawlee library on AWS Lambda but is encountering an error: "Reclaiming failed request back to the list or queue. Protocol error (Target.setAutoAttach): Target closed." They are using Chromium version 109 and Node.js version 16.

Another community member suggests checking a related GitHub issue, but this did not work for the original poster. The original poster has added specific dependencies to their project, including "@sparticuz/chromium" to support Node.js v16, as the "chrome-aws-lambda" library is not compatible with that version.

The comments also discuss the "launcher: puppeteer" configuration, with one community member asking if it refers to an existing Puppeteer instance or browser session, but the original poster is unsure of the meaning of this comment.

There is no explicitly marked answer in the comments.

I am trying to run Crawlee on AWS Lambda, but I am getting this error message: Reclaiming failed request back to the list or queue. Protocol error (Target.setAutoAttach): Target closed.
chromium version: 109
node version: 16
code:
Plain Text
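// NOTE: the original post does not show its imports. The three lines below are an
// assumption based on the dependencies listed later in the thread; `puppeteer`
// may instead have been required from 'puppeteer-extra'.
const { PuppeteerCrawler, log } = require('crawlee');
const chromium = require('@sparticuz/chromium');
const puppeteer = require('puppeteer-core');

exports.handler = async (event, context, callback) => {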
    const finalResult = [];
    const url = ``;

    try {
        const crawler = new PuppeteerCrawler({
            launchContext: {
                useIncognitoPages: true,
                launchOptions: {
                    executablePath: await chromium.executablePath(),
                    args: ['--no-sandbox', '--disable-setuid-sandbox']
                },
                launcher: puppeteer
            },
            useSessionPool: true,
            requestHandlerTimeoutSecs: 60, 
            browserPoolOptions: {
                useFingerprints: true,
                fingerprintOptions: {
                    fingerprintGeneratorOptions: {
                        browsers: ['chrome'],
                        operatingSystems: ['windows'],
                        devices: ['desktop'],
                        locales: ['en-US', 'en']
                    },
                },
            },
            headless: true,

            async requestHandler({ request, page, enqueueLinks }) {
                log.info(`Processing ${request.url}...`);

            },

            // This function is called if the page processing failed more than maxRequestRetries+1 times.
            failedRequestHandler({ request }) {
                log.error(`Request ${request.url} failed too many times.`);
            },
        });

        // Run the crawler and wait for it to finish.
        await crawler.run([url]);
        log.info('Crawler finished.');

    } catch (error) {
        return callback(error);
    }
    return callback(null, finalResult);
};
6 comments
No, it did not work for me.
Plain Text
"dependencies": {
    "@sparticuz/chromium": "^109.0.1",
    "crawlee": "^3.1.4",
    "puppeteer-core": "^19.4.0",
    "puppeteer-extra": "^3.3.4",
    "puppeteer-extra-plugin-stealth": "^2.11.1"
  }

I am running Crawlee on Node.js v16, and chrome-aws-lambda is not supported on that version. Hence I have added @sparticuz/chromium, which supports Node.js v16.
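For readers following along: the handler above passes only two sandbox flags, while @sparticuz/chromium also exports its own flag list as `chromium.args`. Whether passing that list resolves the "Target closed" error is not confirmed anywhere in this thread, so treat the following as a minimal sketch of how the two lists could be merged. `buildLaunchOptions` is a hypothetical helper, and the path and flags in the usage lines are placeholders rather than values from the package.

```javascript
// Hypothetical helper (not part of Crawlee or @sparticuz/chromium).
// Builds a Puppeteer-style launchOptions object from an already resolved
// executable path plus two flag lists.
function buildLaunchOptions(executablePath, chromiumArgs = [], extraArgs = []) {
    // A Set keeps first-seen order and drops duplicate flags.
    const args = [...new Set([...chromiumArgs, ...extraArgs])];
    return { executablePath, args, headless: true };
}

// Placeholder values so the sketch runs without the real package installed.
// In a Lambda handler these would presumably come from
// `await chromium.executablePath()` and `chromium.args`.
const launchOptions = buildLaunchOptions(
    '/tmp/chromium',
    ['--no-sandbox', '--single-process'],
    ['--no-sandbox', '--disable-setuid-sandbox'],
);

console.log(launchOptions.args);
// [ '--no-sandbox', '--single-process', '--disable-setuid-sandbox' ]
```

The resulting object would go under `launchContext.launchOptions` in the crawler configuration shown in the original post.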
Is launcher: puppeteer referring to an existing Puppeteer instance/browser session?
Sorry, I did not get what you mean by that. Are you able to run Crawlee in AWS Lambda?
Not a browser; it should be the Puppeteer launcher variable.
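The last reply seems to be saying that `launcher` should receive the imported Puppeteer module (the object returned by `require(...)`), not a browser that is already running. A rough way to tell the two apart is by the methods they expose: the module has `launch()`, while a Browser instance has `newPage()`. The sketch below illustrates that distinction with stand-in objects; `describeLauncherCandidate` is a hypothetical helper, not something Crawlee provides.

```javascript
// Hypothetical duck-type check (not a Crawlee API).
// A launcher is a module-like object exposing launch(); a Browser instance
// exposes newPage() instead.
function describeLauncherCandidate(candidate) {
    if (candidate && typeof candidate.launch === 'function') return 'launcher';
    if (candidate && typeof candidate.newPage === 'function') return 'browser';
    return 'unknown';
}

// Stand-ins so the sketch runs without Puppeteer installed.
const fakePuppeteerModule = { launch: async () => ({ newPage: async () => ({}) }) };
const fakeBrowser = { newPage: async () => ({}) };

console.log(describeLauncherCandidate(fakePuppeteerModule)); // launcher
console.log(describeLauncherCandidate(fakeBrowser));         // browser
```

In the original handler, `launcher: puppeteer` would pass this check as long as `puppeteer` is the required module itself.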