conscious-sapphire•3y ago
Puppeteer crawler loop elements list
router.addDefaultHandler(async ({ page, request, enqueueLinks, log }) => {
    log.info('enqueueing new URLs');
    await enqueueLinks({
        selector: "div[role='article'] > a",
        label: 'detail', // corresponds to the handler that processes detail pages
    });
});
How can I scrape the data of each list item in div[role='article']?
I don't just want to get the list of URLs to add to the queue.

5 Replies
ambitious-aqua•3y ago
Try this:
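A minimal sketch of a handler along these lines, assuming Crawlee's PuppeteerCrawler context (which provides parseWithCheerio() and log) and the selector from the question. The name listHandler and the shape of the collected items are hypothetical, chosen only for illustration:

```javascript
// Sketch of a list handler for PuppeteerCrawler.
// Assumption: the context provides parseWithCheerio() and log, as in Crawlee.
// listHandler and the { href, text } item shape are hypothetical names.
async function listHandler({ parseWithCheerio, log }) {
    // Parse the HTML of the page rendered by Puppeteer into a Cheerio object.
    const $ = await parseWithCheerio();
    const items = [];

    // Loop over every anchor that is a direct child of an article div.
    for (const el of $("div[role='article'] > a").toArray()) {
        const href = $(el).attr('href');
        const text = $(el).text().trim();
        log.info(`Found item: ${href}`);
        items.push({ href, text });
    }

    return items;
}

// Registration, assuming the router from the question:
// router.addDefaultHandler(listHandler);
```

The page is still loaded and rendered by Puppeteer; Cheerio is only used to query the resulting HTML.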
This will scrape the href attribute from each anchor element. But within the for...of loop, you can do anything.
conscious-sapphireOP•3y ago
But I want to use Puppeteer, not Cheerio, because the page is rendered by JS.
You can use the approach above. parseWithCheerio() is just a utility method that lets you work with the data the same way as with CheerioCrawler:
https://crawlee.dev/api/next/puppeteer-crawler/interface/PuppeteerCrawlingContext#parseWithCheerio
ambitious-aqua•3y ago
The code above works with PuppeteerCrawler.
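If you would rather not involve Cheerio at all, the same loop can probably be written against the Puppeteer page directly. A rough sketch, assuming the page object from the handler context and Puppeteer's page.$$eval(); the name listHandlerNative and the item shape are hypothetical:

```javascript
// Sketch of the same extraction using only the Puppeteer page.
// Assumption: `page` is the Puppeteer Page from the crawling context.
// listHandlerNative and the { href, text } item shape are hypothetical names.
async function listHandlerNative({ page, log }) {
    // $$eval runs the callback in the browser against all matching elements.
    const items = await page.$$eval("div[role='article'] > a", (anchors) =>
        anchors.map((a) => ({
            href: a.href,
            text: (a.textContent || '').trim(),
        })),
    );

    for (const item of items) {
        log.info(`Found item: ${item.href}`);
    }

    return items;
}

// Registration, assuming the router from the question:
// router.addDefaultHandler(listHandlerNative);
```

Here the callback executes inside the page, so it can only return serializable values such as strings and plain objects.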
conscious-sapphireOP•3y ago
thank you