Apify Discord Mirror

Updated 2 weeks ago

how to pass data to routes.py

If I use multiple files, what is the best way to pass data (for example user input that contains 'max_results') to my routes.py?

example snippet main.py
Plain Text
        max_results = 5 # example

        crawler = PlaywrightCrawler(
            headless=False, 
            request_handler=router,
        )
        await crawler.run([start_url])


snippet routes.py
Plain Text
@router.default_handler
async def default_handler(context: PlaywrightCrawlingContext) -> None:
    max_results = ???
3 comments
Is this good?
Plain Text
        request = Request.from_url(
            url=start_url,
            user_data={
                "max_results": max_results,
            }
        )

        print(start_url)
        crawler = PlaywrightCrawler(
            headless=False, 
            request_handler=router,
        )
        await crawler.run([request])



Plain Text
@router.default_handler
async def default_handler(context: PlaywrightCrawlingContext) -> None:
    max_results = context.request.user_data.get('max_results')
    print(f"Max results: {max_results}")
Yes, I often use this exact approach myself: passing data through user_data.
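One caveat with this approach: the value lives on the individual request, so a handler should not assume the key is always present or well-formed (for example on requests that were enqueued later without it). A minimal sketch of a defensive reader is below. `read_max_results` and `DEFAULT_MAX_RESULTS` are hypothetical names, not part of Crawlee, and the function only relies on `.get`, as used in the handler above.

```python
from typing import Any, Mapping

# Hypothetical fallback used when the request carries no usable value.
DEFAULT_MAX_RESULTS = 5


def read_max_results(user_data: Mapping[str, Any], default: int = DEFAULT_MAX_RESULTS) -> int:
    """Return a positive int from user_data['max_results'], or the default."""
    raw = user_data.get("max_results", default)
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Missing (None) or non-numeric input: fall back to the default.
        return default
    # Zero or negative limits make no sense here, so fall back as well.
    return value if value > 0 else default
```

In the handler this would be called as `read_max_results(context.request.user_data)` instead of the bare `.get('max_results')`.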
Pass max_results via a shared configuration module
Create a config.py file to store global configuration variables that both main.py and routes.py can access.

Example:
config.py
max_results = 5 # Default value


main.py
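import asyncio

# Assumes a recent Crawlee for Python; older releases use crawlee.playwright_crawler instead.
from crawlee.crawlers import PlaywrightCrawler

import config
from routes import router


async def main() -> None:
    start_url = 'https://example.com'  # placeholder start URL
    config.max_results = 5  # Set max_results dynamically, before the crawler starts
    crawler = PlaywrightCrawler(
        headless=False,
        request_handler=router,
    )
    await crawler.run([start_url])


if __name__ == '__main__':
    asyncio.run(main())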


routes.py
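# Assumes a recent Crawlee for Python; older releases use crawlee.playwright_crawler instead.
from crawlee.crawlers import PlaywrightCrawlingContext
from crawlee.router import Router

import config

router = Router[PlaywrightCrawlingContext]()

@router.default_handler
async def default_handler(context: PlaywrightCrawlingContext) -> None:
    # Read the attribute at call time so the value set in main.py is visible here
    max_results = config.max_results
    print(f"Max results: {max_results}")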
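A possible variation on the shared module, sketched under assumptions: keeping the values on a single settings object avoids a common pitfall, where `from config import max_results` copies the value at import time and a later assignment to `config.max_results` is never seen by the handler. `Settings`, `settings`, `configure` and `current_max_results` are illustrative names only, and the sketch is condensed into one block; in a project the definitions would live in config.py.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    """Run-wide options shared between main.py and routes.py."""

    max_results: int = 5  # default, same value as in config.py above


# Single shared instance; importers mutate it instead of rebinding names.
settings = Settings()


def configure(max_results: int) -> None:
    """Meant to be called once from main.py before the crawler starts."""
    if max_results <= 0:
        raise ValueError("max_results must be positive")
    settings.max_results = max_results


def current_max_results() -> int:
    """Meant to be called from a handler; reads the attribute at call time."""
    return settings.max_results
```

With this layout main.py would call `configure(...)` with the user's input, and the handler would call `current_max_results()`. Like the plain module variable, this is process-wide state, so it suits a single crawl per process; values that differ per request fit better in user_data.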