rival-black
rival-black•17mo ago

Want to scrape multiple elements on a web page

import { PlaywrightCrawler } from "crawlee";

const crawler = new PlaywrightCrawler({
    requestHandler: async ({ page, request, enqueueLinks }) => {
        console.log(`Processing: ${request.url}`);
        console.log(request.label);

        const title = await page
            .locator("span.nm-collections-row-name")
            .textContent();

        const results = {
            title,
        };

        console.log(results);
    },

    // Let's limit our crawls to make our tests shorter and safer.
    maxRequestsPerCrawl: 20,
});

await crawler.run(["https://www.netflix.com/in/browse/genre/1191605"]);
This is the code. The page contains multiple span tags that match the selector, so it throws an error. I want to get all of the matching span tags. Can anyone help me?
4 Replies
Saurav Jain
Saurav Jain•17mo ago
maybe by this:
// Retrieve all span elements with the specified class
const titles = await page.$$eval("span.nm-collections-row-name", elements => {
    // Map the text content of each element
    return elements.map(el => el.textContent.trim());
});
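Some background on why this helps, plus an alternative sketch. Playwright locators are strict, so calling `textContent()` on a locator that matches more than one element throws, which is most likely the error in the original code. `page.$$eval` runs its callback against every element matching the selector (similar to `querySelectorAll`), while `page.$eval` only uses the first match. If you would rather stay with the Locator API, `allTextContents()` returns the text of every match. The helper names `cleanTitles` and `extractTitles` below are made up for illustration, and the selector is simply the one from the question, so treat this as a rough sketch rather than the definitive way to do it:

```javascript
// Hypothetical helper (not part of Crawlee or Playwright):
// trims each raw text value and drops empty entries.
function cleanTitles(rawTexts) {
    return rawTexts
        .map((text) => (text ?? "").trim())
        .filter((text) => text.length > 0);
}

// Hypothetical helper: expects a Playwright `page` object, such as the one
// passed to the requestHandler. The selector is assumed from the question.
async function extractTitles(page) {
    // allTextContents() resolves to an array with the text of every match,
    // so it does not throw when several elements match the selector.
    const rawTexts = await page
        .locator("span.nm-collections-row-name")
        .allTextContents();
    return cleanTitles(rawTexts);
}
```

Inside the requestHandler you could then write `const titles = await extractTitles(page);` and log `{ titles }` instead of a single `title`.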
rival-black
rival-blackOP•17mo ago
Okay, let me try it out. It worked. Thank you so much, Saurav.
unwilling-turquoise
unwilling-turquoise•16mo ago
I'm a newbie to Crawlee. May I know what the $$ means in this case? Why not just page.eval? Is there a link to the documentation?
