Apify Discord Mirror

Updated last year

Python logging equivalent

At a glance

The community member is porting existing Selenium-based scrapers to Crawlee and is looking for an equivalent to the verbose logging capabilities in Python. They found that Crawlee's logging does not include function names or line numbers, and are asking for a minimal example of how to extend the default loggers to add this information.

The community members discuss a few options:

  • One community member provides an example of extending the LoggerText class to create a custom logger with additional logging capabilities.
  • Another community member asks how to pass extra arguments, such as line number and function name, to the custom logger.
  • The community members discuss the limitations of the Crawlee logging system, noting that it is not possible to automatically retrieve information like line numbers or function names.
  • The community members suggest using a third-party logging library like Winston to handle more advanced logging requirements, such as log rotation.

There is no explicitly marked answer, but the community members provide guidance on how to extend the Crawlee logging system and use third-party libraries to achieve the desired logging capabilities.

Hey folks, I am porting our existing Selenium-based scrapers to Crawlee, and one thing I am used to is how verbose we can make the logging in Python. Is there an equivalent in Crawlee?

For example, here's my logging format in Python:
logging.Formatter("%(asctime)s %(name)s:%(levelname)s [%(funcName)s:%(lineno)d] %(message)s")

We also have support for rotating logs in Python.

I went through the docs, and while I found how to add a timestamp to the logs, there's no mention of function names or even code line numbers. The only thing I found is that we can extend the default loggers in Crawlee, so can someone post a minimal example of how to extend these loggers to add function names, etc.?

Or is there support for third-party logging libraries like Winston that I can use for this? I am using Playwright, if that matters.
16 comments
Hi,
Here is an example of setting a custom logger in TypeScript, based on the LoggerText class. Check LoggerText for other methods that you might want to override, such as warning(...) and error(...):

import { log, LoggerText, LogLevel } from 'crawlee';

class MyLogger extends LoggerText {
    override log(level: LogLevel, message: string, data?: any, exception?: any) {
        console.log(`My own log `, level, message, data, exception);
    }

    // ...
}

log.setOptions({
    logger: new MyLogger(log.getOptions()),
});


Hope this helps.
Hey, thanks. How would I go about passing extra arguments to my logs, such as the line number, module name, function name, etc.? Would it be something like this?
class MyLogger extends LoggerText {
    override log(level: LogLevel, message: string, data?: any, exception?: any, extraArgs?: any) {
        console.log(`My own log `, level, message, data, exception, extraArgs);
    }

    // ...
}

//calling it like this?
logger.info(message, data, extraArgs)

And can you link the TypeScript implementation you mentioned? I found the log implementation here: https://github.com/apify/apify-shared-js/blob/master/packages/log/src/logger_text.ts and have been going through this package to figure out my own. Is this the correct place? I couldn't find it in the Crawlee repo or in the docs.

So there are "system" logs coming from Crawlee. These will always use the currently set Logger, which will always use the same method signatures, and you cannot do much about that.
If you want to provide custom parameters to a log call, that is what the data parameter is for, but you always need to pass them yourself.

log.info(`MY log message`, {
    myData1: 'data1',
    myData2: 'data2'
})
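
To make the data-parameter approach more concrete, here is a minimal sketch of a formatter that takes caller-supplied fields (function name and line number) and lays them out like the Python format from the question. The `LogRecord` shape and the `formatRecord` name are hypothetical and not part of Crawlee, and the timestamp is ISO 8601 rather than Python's `asctime` layout.

```typescript
// Hypothetical record shape; none of these names come from Crawlee.
interface LogRecord {
    time: Date;
    name: string;     // logger name, like Python's %(name)s
    level: string;    // e.g. "INFO"
    funcName: string; // must be supplied by the caller
    lineNo: number;   // must be supplied by the caller
    message: string;
}

// Mirrors "%(asctime)s %(name)s:%(levelname)s [%(funcName)s:%(lineno)d] %(message)s".
function formatRecord(r: LogRecord): string {
    return `${r.time.toISOString()} ${r.name}:${r.level} [${r.funcName}:${r.lineNo}] ${r.message}`;
}

const exampleLine = formatRecord({
    time: new Date(Date.UTC(2024, 0, 2, 3, 4, 5)),
    name: 'scraper',
    level: 'INFO',
    funcName: 'handlePage',
    lineNo: 42,
    message: 'Page loaded',
});
console.log(exampleLine);
// prints "2024-01-02T03:04:05.000Z scraper:INFO [handlePage:42] Page loaded"
```

In a custom logger, a function like this could build the line from whatever fields arrive in the data object.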


I am afraid there is no built-in way for the logger to provide the line number or the name of the calling method automatically.

I originally thought that you were just interested in the way the logs are formatted.
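
That said, a workaround sometimes used in Node.js is to read the stack trace of a freshly created Error and parse the calling frame. The sketch below assumes a V8-style stack format, which is not standardized and can change with bundling or minification. `getCallerInfo` and `CallerInfo` are hypothetical names, not part of Crawlee or @apify/log. It also assumes the helper is called directly from the function of interest; inside a logger wrapper, the frame index would need adjusting.

```typescript
// Hypothetical helper; not part of Crawlee or @apify/log.
interface CallerInfo {
    functionName: string;
    line: number;
}

// Parses a V8-style stack ("    at fn (file:line:col)") to find the caller.
// Returns undefined if the stack is missing or in an unexpected format.
function getCallerInfo(): CallerInfo | undefined {
    const stack = new Error().stack;
    if (!stack) return undefined;
    // Line 0 is the "Error" header, line 1 is getCallerInfo, line 2 is the caller.
    const frame = stack.split('\n')[2];
    if (!frame) return undefined;
    const match = /at (?:async )?(?:(.+?) \()?.*?:(\d+):\d+\)?$/.exec(frame.trim());
    if (!match) return undefined;
    return { functionName: match[1] ?? '<anonymous>', line: Number(match[2]) };
}
```

The result could then be passed along in the data object of a log call. Capturing a stack on every log line has a cost, and Crawlee's own "system" logs would report Crawlee internals as the caller.
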
Gotcha, thanks for the info! This is really helpful. I'll hack together a solution that works for me, thanks.
And I'm assuming I'll have to implement rotating logs, etc. on my own, right?
Using the example above?
Is there an option to write the logs to a buffer, or something similar, which in turn updates the .log file?
Or do I open the buffer in code and then pass it in using data?
If you want to solve this at the application level, then yes, I am afraid you need to handle such cases yourself.
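
For a sense of what handling it yourself might involve, here is a minimal sketch of size-based rotation using only Node's built-in `fs` module. `appendWithRotation` is a hypothetical helper, not a Crawlee API. It keeps a single backup file and uses synchronous I/O for brevity, so it is an illustration rather than something production-ready.

```typescript
import { Buffer } from 'node:buffer';
import { appendFileSync, existsSync, mkdtempSync, readFileSync, renameSync, statSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper: appends a line to `file`, rotating first when the file
// would grow past `maxBytes`. Keeps a single backup named `<file>.1`.
function appendWithRotation(file: string, line: string, maxBytes: number): void {
    const incoming = Buffer.byteLength(line) + 1; // +1 for the trailing newline
    if (existsSync(file) && statSync(file).size + incoming > maxBytes) {
        renameSync(file, `${file}.1`); // replaces any older backup
    }
    appendFileSync(file, `${line}\n`);
}

// Demo in a throwaway temp directory with a tiny 20-byte limit.
const demoDir = mkdtempSync(join(tmpdir(), 'rotation-demo-'));
const logFile = join(demoDir, 'crawler.log');
appendWithRotation(logFile, 'first message', 20);  // 14 bytes, file is created
appendWithRotation(logFile, 'second message', 20); // 14 + 15 > 20, so rotate first
const currentLog = readFileSync(logFile, 'utf8');       // "second message\n"
const backupLog = readFileSync(`${logFile}.1`, 'utf8'); // "first message\n"
```

A custom logger's log method could call a helper like this instead of console.log.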

What I gave as an example is just middleware for the logs, so I believe you could use a third-party logging solution inside it, such as Winston ( https://stackoverflow.com/questions/18055971/log-rotation-in-node-js ), which may cover some of these cases, like logging to files.
Ah, got it. So in the example you posted, we can omit the console.log, which does the actual logging, and instead pass these parameters to Winston or something similar, which will do this for us?
This will really help a lot.
Yes, something like that.
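
The forwarding idea can be sketched without any library: the logger does no output itself and only hands each entry to a sink. Everything below (`LogSink`, `ForwardingLogger`, `memorySink`) is hypothetical and stands in for the real pieces. In practice the sink would wrap a third-party logger such as Winston, and the forwarding call would sit inside the log method of the MyLogger class shown earlier.

```typescript
// Hypothetical stand-in for whatever backend receives the logs.
interface LogSink {
    write(level: string, message: string, meta?: Record<string, unknown>): void;
}

// Hypothetical middleware: does no output itself, only forwards to the sink.
class ForwardingLogger {
    private readonly sink: LogSink;

    constructor(sink: LogSink) {
        this.sink = sink;
    }

    log(level: string, message: string, data?: Record<string, unknown>): void {
        this.sink.write(level, message, data);
    }
}

// In-memory sink used only for this demo.
const captured: string[] = [];
const memorySink: LogSink = {
    write: (level, message, meta) => {
        captured.push(`${level} ${message} ${JSON.stringify(meta ?? {})}`);
    },
};

new ForwardingLogger(memorySink).log('INFO', 'Page loaded', { url: 'https://example.com' });
console.log(captured[0]);
// prints: INFO Page loaded {"url":"https://example.com"}
```
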
thanks for the help!