ambitious-aqua•3y ago
Extracting text from list elements
I want to extract the text from all <li> elements inside an unordered list <ul>.
Trying
await page.locator("div.my_class > ul > li").textContent();
causes an error: strict mode violation: locator('div.my_class > ul > li') resolved to x elements.
The presence of multiple elements is expected, since this is a list.
Playwright itself doesn't seem to have an issue with selectors that return multiple elements. I did find the strictSelectors parameter in the Crawlee docs, but I didn't manage to set it to false (if that is even the solution).
In Scrapy, item.add_css("list", "div.my_class > ul > li::text") returns a list of the text of each list item, which is what I'm looking for.
Does anyone know how to solve this?
2 Replies
You can try the Crawlee context function parseWithCheerio (https://crawlee.dev/api/playwright-crawler/interface/PlaywrightCrawlingContext#parseWithCheerio)
and then extract the text with Cheerio functions.
or you can use
https://playwright.dev/docs/api/class-page#page-eval-on-selector-all
await page.$$eval('div.my_class > ul > li', (els)=>els.map((x)=>x.textContent))
I'm writing this from memory, so it may not be exactly right, but something like this should work.
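If you would rather stay with locators, here is a minimal sketch along the same lines. The strict mode check only applies to single-element operations such as textContent(); Playwright's Locator also has allTextContents(), which returns the text of every matched element. The helper name listItemTexts and the TextSource interface are hypothetical, introduced only to keep the snippet self-contained. Trimming whitespace and dropping empty items is an assumption, not something the question asks for.

```typescript
// Hypothetical minimal interface: the only Locator method this sketch needs.
// Playwright's Locator provides allTextContents(): Promise<string[]>.
interface TextSource {
  allTextContents(): Promise<string[]>;
}

// Hypothetical helper: returns the trimmed text of every matched element,
// dropping entries that are empty after trimming (an assumption).
async function listItemTexts(items: TextSource): Promise<string[]> {
  const raw = await items.allTextContents();
  return raw.map((text) => text.trim()).filter((text) => text.length > 0);
}

// Intended use inside a request handler (not executed here):
//   const texts = await listItemTexts(page.locator("div.my_class > ul > li"));
```

Typing the parameter structurally rather than as Locator keeps the sketch free of imports. A real locator should fit, since it exposes the same method.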
Or, as the docs suggest, you can try the same with
https://playwright.dev/docs/api/class-locator#locator-evaluate-all
ambitious-aqua OP•3y ago
Thanks @HonzaS, using $$eval works: