cloudy-cyan, 3y ago

Playwright Crawler fails on undefined page

Hello there! I just built my first actor using the Apify CLI. I chose to use a TypeScript Playwright crawler. By default it uses the createPlaywrightRouter() function to create a router and passes it to the requestHandler of the PlaywrightCrawler. All seems well, and according to TypeScript, I should be able to access a page object in the handler (I'm only using addDefaultHandler). However, when I run the actor on the Apify platform, it fails with the following exception:
2023-02-09T15:16:45.925Z INFO PlaywrightCrawler: Start of default handler
2023-02-09T15:16:45.932Z WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. page.evaluate: ReferenceError: page is not defined
2023-02-09T15:16:45.934Z at eval (eval at evaluate (:197:30), <anonymous>:5:13)
2023-02-09T15:16:45.936Z at UtilityScript.evaluate (<anonymous>:199:17)
2023-02-09T15:16:45.939Z at UtilityScript.<anonymous> (<anonymous>:1:44)
So it seems page is not defined. I can't seem to find anything in the docs about this. Am I missing something during initialisation? Any help would be greatly appreciated!
3 Replies
HonzaS, 3y ago
And did you destructure page in the default handler definition? Can you show the code of the default handler?
cloudy-cyan (OP), 3y ago
Yes, currently I'm doing the destructuring in the function itself, so I could log the whole context object. But I encountered this error while I was destructuring it in the function arguments as well.
import { createPlaywrightRouter } from 'crawlee'
import { Actor } from 'apify'

export const router = createPlaywrightRouter()

router.addDefaultHandler(async (context) => {
    const { request, page, log, enqueueLinks } = context
    // scroll to the bottom of the page
    log.info('Start of default handler', { context })
    await page.evaluate(async () => {
        let lastHeight = 0
        let i = 0
        while (i < 5) {
            await page.mouse.wheel(0, 10000)
            await page.waitForLoadState('networkidle', { timeout: 10000 })
            i++
        }
    })
})
The code itself is just meant to scroll to the bottom in a generic way and wait for things to load. Like infinite scroll, but I want it to work on websites without any custom configuration.
HonzaS, 3y ago
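You are referencing page inside the evaluate function, and this will not work. You can see the docs here: https://playwright.dev/docs/evaluating. The function passed to evaluate runs in the browser environment, so there is no page object there. That is why the error is a ReferenceError thrown from inside page.evaluate rather than from the handler itself.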
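One possible way to apply this, as a rough sketch rather than a definitive fix: keep the scrolling loop on the Node.js side and drop page.evaluate entirely, since page.mouse.wheel and page.waitForLoadState are Playwright calls that already act on the browser from outside. The names ScrollablePage and scrollToBottom are made up for this example, and swallowing the networkidle timeout is an assumption about the desired behaviour, not something from the original code.

```typescript
// Minimal structural type covering only the two Playwright Page members used below.
// A real Playwright `Page` satisfies it; `ScrollablePage` is a name invented for this sketch.
interface ScrollablePage {
    mouse: { wheel(deltaX: number, deltaY: number): Promise<void> }
    waitForLoadState(
        state?: 'load' | 'domcontentloaded' | 'networkidle',
        options?: { timeout?: number },
    ): Promise<void>
}

// Hypothetical helper: scrolls down `rounds` times from the Node.js side,
// so `page` is never referenced inside page.evaluate().
async function scrollToBottom(page: ScrollablePage, rounds = 5): Promise<void> {
    for (let i = 0; i < rounds; i++) {
        // Scroll the viewport down by 10000 pixels, as in the original snippet.
        await page.mouse.wheel(0, 10000)
        try {
            // Give lazily loaded content a chance to arrive before the next scroll.
            await page.waitForLoadState('networkidle', { timeout: 10000 })
        } catch {
            // Assumption: a timeout here is not fatal, so keep scrolling.
        }
    }
}
```

Inside the default handler this would be called as `await scrollToBottom(page)` in place of the `page.evaluate(...)` block. If code really does need to run in the browser (for example to read `document.body.scrollHeight`), it can go inside page.evaluate, but it may only use browser globals and values passed in as arguments.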
