optimistic-gold•2y ago
Add certificates to Playwright crawler using Chromium
hey folks, we are trying to integrate a proxy into our crawlers, and the issue is that the proxy needs a certificate to be present before it'll allow us to authenticate. I couldn't find any option for this in the documentation.
Is there a way I can add those certs in Crawlee/Playwright? Or, if Crawlee exposes agentOptions from Playwright anywhere (couldn't find it in the docs), that would also work, as per https://github.com/microsoft/playwright/issues/1799#issuecomment-959011162
GitHub
[Feature] allow client certificate selection and settings from Java...
Similarly to puppeteer/puppeteer#540 Currently when navigating to a page that requires client certificates and client certificates are available a popup is shown in Firefox and Chrome which asks to...
13 Replies
optimistic-goldOP•2y ago
P.S. I have added that certificate on my server and curl is working fine but crawlee is not
so I'm assuming crawlee is not picking it up
and the error I'm getting is
page.goto: net::ERR_PROXY_CONNECTION_FAILED at
Hi @AltairSama2,
Crawlee uses Playwright under the hood, so you should be able to intercept requests in the usual way. I found an example for Playwright itself ( https://github.com/microsoft/playwright/issues/1799#issuecomment-959011162 ).
Can you put together a minimal working example using only Playwright (without Crawlee) to confirm that the issue is in Crawlee and not in Playwright itself? I found a lot of issues about using certificates in Playwright.
optimistic-goldOP•2y ago
hey, when I use Playwright it gives me a cert invalid error, which I can bypass by using
but with Crawlee it's not working
I think I got it wrong, I don't need to use the proxy's certificate with Playwright/Crawlee
it's just a proxy config issue
page.goto: net::ERR_PROXY_CONNECTION_FAILED
here's the full error
we bypassed it by avoiding the cert route and it's working fine for us
Does the same proxy configuration work for other websites?
optimistic-goldOP•2y ago
not with crawlee but with playwright yeah
it worked with Crawlee once we were fully onboarded with the proxy provider and we didn't need to use their cert
@AltairSama2 Can you please provide a code snippet with your current configuration for Crawlee?
optimistic-goldOP•2y ago
@AltairSama2 Thank you for your feedback, I am currently investigating this with the Crawlee developer team.
Would it be possible to also provide us with the pure Playwright code that is currently working for you? Is the certificate taken from the system, or are you importing it at the application level?
optimistic-goldOP•2y ago
it's taken from the system
optimistic-goldOP•2y ago
I was trying to figure out how to do it at the app level but couldn't make it work
but in the end system level worked fine
here's the pure playwright code
I think it was an issue on our end, because after full account activation with the proxy provider it worked just fine. The only issue we are currently facing is that a lot of our requests are failing with the proxy, but that's unrelated to this and is probably a config issue
@AltairSama2 You should be able to replicate this event in Crawlee:
and drop the proxyConfiguration attribute.
And please let me know if it helped 🙂
optimistic-goldOP•2y ago
hey thanks! really appreciate it
I can't repro the original issue because we are not relying on the certs anymore but this method is also working for us
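Editor's sketch: the snippet from the reply above did not survive in this thread, so what follows is not the code that was posted. It is a minimal illustration of one way to follow the advice, assuming the proxy is handed to Chromium through Playwright's own proxy launch option (server, username, password) inside Crawlee's launchContext.launchOptions, with no proxyConfiguration set. The helper name toPlaywrightProxy, the variable crawlerOptions, and the proxy URL are hypothetical.

```typescript
// Local mirror of the fields Playwright's `proxy` launch option accepts
// (only the ones used here). Declared locally so the sketch runs without
// Playwright or Crawlee installed.
interface PlaywrightProxy {
    server: string;
    username?: string;
    password?: string;
}

// Hypothetical helper: split a single proxy URL into the separate
// server / username / password fields.
function toPlaywrightProxy(proxyUrl: string): PlaywrightProxy {
    const url = new URL(proxyUrl);
    const proxy: PlaywrightProxy = { server: `${url.protocol}//${url.host}` };
    // URL keeps credentials percent-encoded, so decode them before use.
    if (url.username) proxy.username = decodeURIComponent(url.username);
    if (url.password) proxy.password = decodeURIComponent(url.password);
    return proxy;
}

// Assumed shape of the crawler options: the proxy lives in launchOptions,
// and there is deliberately no `proxyConfiguration` key.
const crawlerOptions = {
    launchContext: {
        launchOptions: {
            // Placeholder URL; substitute the provider's real endpoint.
            proxy: toPlaywrightProxy("http://user:pass@proxy.example.com:8000"),
        },
    },
};

// With Crawlee installed, this object would be spread into the constructor:
//   new PlaywrightCrawler({ ...crawlerOptions, requestHandler: async ({ page }) => { /* ... */ } })
console.log(crawlerOptions.launchContext.launchOptions.proxy.server);
```

Keeping the proxy in launchOptions means Chromium is started with it directly, which matches how the pure Playwright version was described as working. Whether that fits a setup with rotating proxies is a separate question this sketch does not address.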