rival-black, 2y ago

Change the INFO output of CheerioCrawler

Hello, is it possible to change the standard INFO output of CheerioCrawler? I looked at the documentation, but the methods described there didn't work: https://crawlee.dev/api/core/class/Log
5 Replies
lemurio, 2y ago
hi, could you please specify what exactly didn't work or describe what you're trying to achieve?
rival-black (OP), 2y ago
Hi @lemurio, thanks for the answer. I meant that when I start a Cheerio or basic crawler, I see the log message INFO BasicCrawler: Starting The Crawler, along with other messages. By slightly modifying the BasicCrawler class I was able to get the INFO message to show CustomCrawler instead, but I would like to do this in a documented way. Is there any information on this in the documentation? I see that I can add custom messages of my own, but I did not find a way to change the built-in INFO messages. I'm building my own crawler on top of Crawlee, so I need to change the base messages.
rival-black (OP), 2y ago
https://crawlee.dev/docs/examples/basic-crawler, https://crawlee.dev/docs/examples/cheerio-crawler Please let me know if you know how to change these standard messages, using the standard options from these examples.
Lukas Krivka, 2y ago
You can also monkeypatch it
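// Bind to crawler.log (not crawler) so `this` inside info() is still the logger
const origInfo = crawler.log.info.bind(crawler.log);
crawler.log.info = (message, data) => {
    // do what you want, e.g. rewrite the message, then forward it
    origInfo(message, data);
};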
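To make the monkeypatch idea above a bit more concrete, here is a minimal sketch that runs without Crawlee installed. It assumes only that the logger exposes an `info(message, data)` method, as `crawler.log` does; `fakeLog`, `captured`, and `patchInfo` are hypothetical names made up for this illustration and are not part of Crawlee. The sample message text is also just a placeholder, so check what your crawler actually prints before matching on it.

```javascript
// Hypothetical stand-in for `crawler.log`, so the sketch runs on its own.
// It records lines instead of printing them.
const captured = [];
const fakeLog = {
    prefix: 'BasicCrawler',
    info(message, data) {
        captured.push(`INFO ${this.prefix}: ${message}`);
    },
};

// Hypothetical helper: wraps log.info so each message passes through `rewrite`.
// Returning null from `rewrite` suppresses the message entirely.
// The returned function undoes the patch.
function patchInfo(log, rewrite) {
    const original = log.info;
    // Bind to the logger itself so `this` is preserved inside info().
    const boundInfo = original.bind(log);
    log.info = (message, data) => {
        const next = rewrite(message);
        if (next === null) return;
        boundInfo(next, data);
    };
    return () => {
        log.info = original;
    };
}

// Usage: replace one message, drop another, pass everything else through.
const restore = patchInfo(fakeLog, (message) => {
    if (message === 'Starting the crawler.') return 'Custom crawler is starting.';
    if (message.startsWith('Noisy')) return null;
    return message;
});

fakeLog.info('Starting the crawler.');
fakeLog.info('Noisy status line');
fakeLog.info('Some other message');

restore();
fakeLog.info('Starting the crawler.');
```

With a real crawler you would pass `crawler.log` instead of `fakeLog`. Note that this only rewrites the message text; the prefix shown before the message is produced by the logger itself, so changing it would need a different approach.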
rival-black (OP), 2y ago
@Lukas Krivka OK, I'll try that. Thanks a lot, Lukas!