absent-sapphire
absent-sapphire8mo ago

Double log output

In main.py logging works as expected, but in routes.py every log message is printed twice for some reason. I did not set up any custom logging; I just call Actor.log.info("STARTING A NEW CRAWL JOB"). Example output:
[apify] INFO Checking item 17
[apify] INFO Checking item 17 ({"message": "Checking item 17"})
[apify] INFO Processing new item with index: 17
[apify] INFO Processing new item with index: 17 ({"message": "Processing new item with index: 17"})
If I add this in my main.py (https://docs.apify.com/sdk/python/docs/concepts/logging)
async def main() -> None:
    async with Actor:
        ##### SETUP LOGGING #####
        handler = logging.StreamHandler()
        handler.setFormatter(ActorLogFormatter())

        apify_logger = logging.getLogger('apify')
        apify_logger.setLevel(logging.DEBUG)
        apify_logger.addHandler(handler)
it prints everything from main.py 2x, and everything from routes.py 3x.
[apify] INFO STARTING A NEW CRAWL JOB
[apify] INFO STARTING A NEW CRAWL JOB ({"message": "STARTING A NEW CRAWL JOB"})
[apify] INFO STARTING A NEW CRAWL JOB ({"message": "STARTING A NEW CRAWL JOB"})
10 Replies
Hall
Hall8mo ago
This post was marked as solved by DuxSec.
Exp
Exp8mo ago
Hi, modify your logging setup in main.py:
async def main() -> None:
    async with Actor:
        apify_logger = logging.getLogger('apify')
        if not apify_logger.hasHandlers():
            handler = logging.StreamHandler()
            handler.setFormatter(ActorLogFormatter())
            apify_logger.setLevel(logging.DEBUG)
            apify_logger.addHandler(handler)
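One detail worth checking with this guard: in the standard library, `Logger.hasHandlers()` also returns True when only an ancestor logger has a handler, so the guard can skip setup even though the logger itself has none. The sketch below uses only `logging`; the logger names `demo_parent` and `demo_parent.child` are hypothetical and stand in for whatever hierarchy the Actor actually has.

```python
import logging

# Hypothetical logger names, used only for this illustration.
parent = logging.getLogger("demo_parent")
child = logging.getLogger("demo_parent.child")

# Isolate the demo from whatever the root logger already has attached.
parent.propagate = False
parent.handlers.clear()
child.handlers.clear()

# Nothing is attached anywhere in this small hierarchy yet.
before = child.hasHandlers()

# Attach a handler to the parent only.
parent.addHandler(logging.NullHandler())

# hasHandlers() walks up the hierarchy, so it now reports True for the child.
after = child.hasHandlers()

# The child's own handler list is still empty.
own_handlers = len(child.handlers)

print(before, after, own_handlers)  # prints: False True 0
```

If the intent is "attach a handler only when this logger has none of its own", checking `logger.handlers` directly is the narrower test.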
absent-sapphire
absent-sapphireOP8mo ago
Thanks for the reply, if I use your code, it still logs everything in routes.py 2x... (it works in main.py correctly)
Exp
Exp8mo ago
If you set up logging in a separate file (logging_setup.py), then import it everywhere, it ensures consistency.
absent-sapphire
absent-sapphireOP8mo ago
Still the same... main.py prints correctly, but routes.py prints everything 2x.
main.py
async def main() -> None:
    async with Actor:
        ##### SETUP LOGGING #####
        await setup_logging()
        ...
utils.py
from apify.log import ActorLogFormatter
import logging

async def setup_logging():
    apify_logger = logging.getLogger('apify')
    if not apify_logger.hasHandlers():
        handler = logging.StreamHandler()
        handler.setFormatter(ActorLogFormatter())

        apify_logger.setLevel(logging.DEBUG)
        apify_logger.addHandler(handler)
routes.py
@router.default_handler
async def default_handler(context: PlaywrightCrawlingContext) -> None:
    await setup_logging()

    Actor.log.info("STARTING A NEW CRAWL JOB")
Exp
Exp8mo ago
The issue is that each time default_handler is called in routes.py, it re-adds a new log handler. Since setup_logging() runs inside default_handler, new handlers keep stacking up, leading to duplicated logs.
Fix utils.py:
from apify.log import ActorLogFormatter
import logging

LOGGING_INITIALIZED = False

async def setup_logging():
    global LOGGING_INITIALIZED
    if LOGGING_INITIALIZED:
        return

    apify_logger = logging.getLogger('apify')

    if not apify_logger.hasHandlers():
        handler = logging.StreamHandler()
        handler.setFormatter(ActorLogFormatter())

        apify_logger.setLevel(logging.DEBUG)
        apify_logger.addHandler(handler)

    LOGGING_INITIALIZED = True
Fix routes.py:
from utils import setup_logging
from apify import Actor

@router.default_handler
async def default_handler(context: PlaywrightCrawlingContext) -> None:
    await setup_logging()
    Actor.log.info("STARTING A NEW CRAWL JOB")
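To illustrate the general "handlers stacking up" effect described here, the following is a minimal standard-library sketch, not the Apify setup itself. The helper names (`add_stream_handler`, `add_stream_handler_once`, `count_lines`) and the logger name `demo_stacking` are hypothetical. It compares a setup function that attaches a new handler on every call with one that attaches only when the logger has no handler of its own.

```python
import io
import logging


def add_stream_handler(logger: logging.Logger, stream: io.StringIO) -> None:
    """Naive setup: attaches a new handler on every call."""
    logger.addHandler(logging.StreamHandler(stream))


def add_stream_handler_once(logger: logging.Logger, stream: io.StringIO) -> None:
    """Idempotent setup: attaches only if the logger has no handler of its own."""
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(stream))


def count_lines(setup, calls: int) -> int:
    """Run `setup` several times, log one message, count the emitted lines."""
    # A standalone Logger instance (no parent), so nothing else interferes.
    logger = logging.Logger("demo_stacking")
    logger.setLevel(logging.INFO)
    stream = io.StringIO()
    for _ in range(calls):
        setup(logger, stream)
    logger.info("STARTING A NEW CRAWL JOB")
    return len(stream.getvalue().splitlines())


naive_lines = count_lines(add_stream_handler, 3)
guarded_lines = count_lines(add_stream_handler_once, 3)
print(naive_lines, guarded_lines)  # prints: 3 1
```

Each attached handler emits the record once, so three setup calls without a guard produce three copies of a single message.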
absent-sapphire
absent-sapphireOP8mo ago
Ah, I see. I implemented your code, but I still have the same issue (routes.py does see LOGGING_INITIALIZED correctly and returns early).
I created a sample project with the minimal code to reproduce the issue:
main.py
from apify import Actor
from crawlee.crawlers import PlaywrightCrawler
from src.routes import router
from src.utils import setup_logging

async def main() -> None:
    async with Actor:
        await setup_logging()
        crawler = PlaywrightCrawler(headless=False, request_handler=router)
        await crawler.run(["https://apify.com"])
routes.py
from crawlee.crawlers import PlaywrightCrawlingContext
from crawlee.router import Router
from apify import Actor
from src.utils import setup_logging

router = Router[PlaywrightCrawlingContext]()

@router.default_handler
async def default_handler(context: PlaywrightCrawlingContext) -> None:
    await setup_logging()
    Actor.log.info("STARTING A NEW CRAWL JOB")
utils.py
from apify.log import ActorLogFormatter
import logging

LOGGING_INITIALIZED = False

async def setup_logging():
    global LOGGING_INITIALIZED
    if LOGGING_INITIALIZED:
        return

    apify_logger = logging.getLogger('apify')

    if not apify_logger.hasHandlers():
        handler = logging.StreamHandler()
        handler.setFormatter(ActorLogFormatter())

        apify_logger.setLevel(logging.DEBUG)
        apify_logger.addHandler(handler)

    LOGGING_INITIALIZED = True
Exp
Exp8mo ago
Add the following code in utils.py:
logging.getLogger().handlers.clear()
apify_logger = logging.getLogger("apify")
apify_logger.setLevel(logging.DEBUG)
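A likely reason this helps: by default, a record logged on a named logger is handled by that logger's own handlers and then propagates to the handlers of its ancestors, including the root logger. If both levels have a handler, each message appears twice. The sketch below reproduces that with the standard library only; `demo_app` is a hypothetical stand-in for the 'apify' logger, and the root handler stands in for one that some framework may have installed (an assumption, not something confirmed in this thread).

```python
import io
import logging


def emit_and_count(clear_root: bool) -> int:
    """Log one message and return how many lines were written."""
    root = logging.getLogger()
    app = logging.getLogger("demo_app")  # hypothetical stand-in for 'apify'
    saved_root, saved_app = root.handlers[:], app.handlers[:]
    stream = io.StringIO()
    try:
        # One handler on the root logger, one on the named logger.
        root.handlers[:] = [logging.StreamHandler(stream)]
        app.handlers[:] = [logging.StreamHandler(stream)]
        app.setLevel(logging.INFO)
        if clear_root:
            # Same call as in the fix above: drop every root handler.
            root.handlers.clear()
        app.info("Checking item 17")
        return len(stream.getvalue().splitlines())
    finally:
        # Restore the original handlers so the demo has no side effects.
        root.handlers[:] = saved_root
        app.handlers[:] = saved_app


print(emit_and_count(clear_root=False))  # prints: 2
print(emit_and_count(clear_root=True))   # prints: 1
```

Clearing the root handlers removes output for every logger that relied on them. A narrower standard-library alternative is to set `propagate = False` on the named logger, which stops its records from reaching ancestor handlers while leaving the root logger untouched.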
absent-sapphire
absent-sapphireOP8mo ago
that fixed it! thank you so much
Exp
Exp8mo ago
I am glad it's resolved
