CheerioCrawler mixed data when using $

Hi Team!

I'm pretty new to Crawlee and I'm experimenting with the CheerioCrawler: crawl a website and store visited URLs with their title in a database.

However, I noticed that, seemingly at random, $('title').text() returns the wrong data (probably from another Cheerio instance?).

I assume I'm doing something wrong, since this seems like very basic usage, so I'm sorry if this question has already been asked.

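import { CheerioCrawler } from "crawlee";

const crawler = new CheerioCrawler({
  requestHandler: async ({ $, enqueueLinks, request }) => {
    const { url } = request;
    // Read the <title> of the page parsed for this request.
    const name = $("title").text();
    console.log(name, "-->", url);
    // Queue the links found on this page for crawling.
    await enqueueLinks();
  },
});

await crawler.run(["https://www.example.com"]);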

Would output:

1st run:

Digital Marketing Expert --> https://www.example.com/job-positions/digital-marketing-expert
Job positions --> https://www.example.com/job-positions
Job positions --> https://www.example.com/internships/internship-software-development

2nd run:

Digital Marketing Expert --> https://www.example.com/job-positions/digital-marketing-expert
Job positions --> https://www.example.com/job-positions
Internship: Software Development --> https://www.example.com/internships/internship-software-development
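One way to narrow this down might be to log the URL the response was actually loaded from next to the requested URL. The sketch below is a guess at a diagnostic, not a confirmed cause: it assumes the mismatch could come from an occasional redirect, and it relies on Crawlee's request.loadedUrl property being set after the page is fetched. The helper name formatCrawlLine is hypothetical.

```javascript
// Hypothetical helper (not part of Crawlee): builds one log line and flags
// a mismatch between the requested URL and the URL the page was loaded from.
function formatCrawlLine({ title, url, loadedUrl }) {
  // Fall back to the requested URL when no loaded URL is available.
  const finalUrl = loadedUrl ?? url;
  const line = `${title.trim()} --> ${url}`;
  // Only annotate the line when the two URLs differ.
  return finalUrl !== url ? `${line} (loaded from ${finalUrl})` : line;
}

// Inside the request handler, this could replace the console.log call:
//   console.log(formatCrawlLine({
//     title: $("title").text(),
//     url: request.url,
//     loadedUrl: request.loadedUrl,
//   }));
```

If the "wrong" titles always come with a differing loaded URL, the title belongs to the page the site redirected to rather than to another Cheerio instance. A plain string comparison may also flag harmless differences such as a trailing slash.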
