Apify & Crawlee

This is the official developer community of Apify and Crawlee.

foreign-sapphire 8/1/2024

Question for any team members: if you scrape using the residential proxy when required for FB/IG, is it possible to get a handle on the costs? I know it looks expensive at 10/GB! @netmilk @vladdy this is a slight concern ....
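One rough way to get a handle on this is a back-of-the-envelope estimate from page count and average page size. The sketch below is only arithmetic: the 10/GB rate comes from the question, while the function name, the page count, the 200 kB average page size, the use of decimal gigabytes, and the dollar currency are all assumptions for illustration.

```python
def estimate_proxy_cost(pages: int, avg_page_bytes: int, price_per_gb: float = 10.0) -> float:
    """Estimate proxy traffic cost for a crawl (hypothetical helper).

    Assumes billing by decimal gigabytes (1 GB = 1,000,000,000 bytes).
    """
    gigabytes = pages * avg_page_bytes / 1_000_000_000
    return round(gigabytes * price_per_gb, 2)


# Hypothetical workload: 50,000 pages at roughly 200 kB each = 10 GB of traffic.
cost = estimate_proxy_cost(pages=50_000, avg_page_bytes=200_000)
print(f"Estimated proxy cost: ${cost:.2f}")  # Estimated proxy cost: $100.00
```

Measuring the real average transfer size on a small test run, and plugging it in here, would give a more trustworthy number than any guess.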
fascinating-indigo 7/31/2024

I need help. I have the Scale plan and, out of nowhere, I get "You do not have permission to run this public Actor."
foreign-sapphire 7/28/2024

Any members of the team: I'd like to know what happens if we use a lot of resources. How long do we have to pay the bill, or do we always need to have enough prepaid credit on hand? @vladdy @JameEnder let me know, this will help us plan better. Thanks.
xenial-black 7/19/2024

You need to explicitly pass the requestQueueId when starting the Actor.
rare-sapphire 7/18/2024

In the Python SDK, is there a way to get the number of links scraped so far?
genetic-orange 7/18/2024

Hi! I'm building something using the Apify SDK for crawling. I'm currently trying to figure out how I can tell the Actor which URLs to skip during recrawls. Is excludeUrlGlobs the right input setting for this? Is there a limit on exclusions? The plan is to regularly crawl news sites, but I would like to only process something if new data (not previously visited URLs) is found....
afraid-scarlet 7/10/2024

I have a problem with the Instagram Scraper Actor
fascinating-indigo 5/28/2024

Hello guys, I have a PuppeteerCrawler, and in the requestHandler I'm trying to click the pagination "next" button, but I cannot determine whether the content has changed. How can I do it? waitForNetworkIdle does not seem to work here. Any ideas?...
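One general approach to this kind of problem is to fingerprint the content before the click and poll until the fingerprint differs. The sketch below shows only that polling idea in plain Python; it does not use Puppeteer or Crawlee, and `fingerprint`, `wait_for_change`, and the simulated snapshots are hypothetical names chosen for illustration. In a real crawler, `get_content` would be whatever call returns the text of the paginated list.

```python
import hashlib
import time
from typing import Callable


def fingerprint(content: str) -> str:
    """Return a stable hash of the content so snapshots are cheap to compare."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def wait_for_change(
    get_content: Callable[[], str],
    previous: str,
    timeout: float = 10.0,
    interval: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_content() until it differs from `previous` or the timeout expires."""
    before = fingerprint(previous)
    deadline = clock() + timeout
    while True:
        if fingerprint(get_content()) != before:
            return True  # content changed
        if clock() >= deadline:
            return False  # gave up: content still identical
        sleep(interval)


# Simulated page: the content is unchanged for two polls, then updates.
snapshots = iter(["page 1", "page 1", "page 2"])
changed = wait_for_change(lambda: next(snapshots), previous="page 1", sleep=lambda _: None)
print(changed)  # True
```

Comparing a hash of the specific list container, rather than the whole page, avoids false positives from ads or timestamps that change on their own.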
genetic-orange 5/23/2024

Support regarding Website Content Crawler

Hello, I'm using apify/website-content-crawler and want to render JavaScript before the crawling process. According to the documentation, the related property for this setting is crawlerType, and it says that if I choose Headless Browser, I can render JavaScript....
graceful-blue 5/23/2024

Hi there! Does anybody know how to increase HttpCrawler request timeouts? According to the docs there is a requestQueue.timeoutSecs property, but even when it is set to e.g. 60 secs, all my HttpCrawler requests fail after a 30-sec timeout 😦

I have made a scraper, written in Python, and I am using Flask for the server. I have several routes, let's say: 1. route1...
rare-sapphire 5/12/2024

Excuse me, I have a problem. If I want to use the Twitter followers scraper, do I need to pay the Actor fee of $25/month as well as the Apify platform Starter plan at $49/month, for a total of $74/month?
ambitious-aqua 5/3/2024

Hi, I am using the Smart Article Extractor Actor to extract info as JSON from an article URL. The Actor runs flawlessly in Apify Console, but when I run it from Postman it returns 201 without any response data. How can I get the response there? Please help.
xenial-black 4/29/2024

The BuiltWith technology scraper
equal-aqua 4/25/2024

Hi, I am currently trying to split up the routes of my Playwright router into separate files. But how do I do this? File1: export const router = createPlaywrightRouter(); File2: router.addHandler. This does not do the job for me. I guess the reason is that File2 never gets executed.
rival-black 4/24/2024

Hello everyone, I would assume there is a Custom GPT that can be asked all kinds of questions about Apify and its Agents, trained on the documentation and other resources. Is there anything like it? Looking forward to learning more...
equal-aqua 4/20/2024

Hi, I am building a web scraper and I want to use a kind of persistent cache to determine which links I have scraped recently in other runs and which I have not. Is the best solution for this to use KeyValueStore.getValue and KeyValueStore.setValue in the requestHandlers?
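The get/set pattern described in the question could look roughly like the sketch below. It uses an in-memory stand-in for a key-value store so that it runs on its own; `InMemoryKeyValueStore`, `url_key`, `should_scrape`, and `mark_scraped` are hypothetical names, not Crawlee or Apify APIs. Hashing the URL into the key is also an assumption, made to keep keys short and free of URL punctuation.

```python
import hashlib
import time
from typing import Any, Optional


class InMemoryKeyValueStore:
    """Stand-in for a persistent key-value store (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}

    def get_value(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def set_value(self, key: str, value: Any) -> None:
        self._data[key] = value


def url_key(url: str) -> str:
    """Derive a fixed-length store key from a URL."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def should_scrape(store: InMemoryKeyValueStore, url: str, max_age_secs: float,
                  now: Optional[float] = None) -> bool:
    """True if the URL was never scraped or its last scrape is older than max_age_secs."""
    now = time.time() if now is None else now
    last_scraped = store.get_value(url_key(url))
    return last_scraped is None or now - last_scraped > max_age_secs


def mark_scraped(store: InMemoryKeyValueStore, url: str, now: Optional[float] = None) -> None:
    """Record the time at which the URL was scraped."""
    store.set_value(url_key(url), time.time() if now is None else now)


store = InMemoryKeyValueStore()
url = "https://example.com/a"
first_visit = should_scrape(store, url, max_age_secs=3600, now=1000.0)   # never seen
mark_scraped(store, url, now=1000.0)
repeat_visit = should_scrape(store, url, max_age_secs=3600, now=1500.0)  # 500 s later
after_expiry = should_scrape(store, url, max_age_secs=3600, now=5000.0)  # 4000 s later
print(first_visit, repeat_visit, after_expiry)  # True False True
```

One record per URL keeps each lookup small; for very large crawls, a single stored set or an external database may be a better fit than thousands of individual keys.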
stormy-gold 4/20/2024

Hey, does anyone know if I can programmatically change the proxy? Like, when certain conditions are met, I want to switch to the next random proxy.
genetic-orange 4/18/2024

Hi, thank you for the tool PlaywrightCrawler. I want to ask how to handle a 429 status code caused by requesting too often. Is there a sleep-for-a-few-seconds method to handle this? Thanks for your attention.
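A common general answer to 429 responses is to wait and retry with exponential backoff, honoring the server's Retry-After value when one is given. The sketch below illustrates that technique in plain Python only; it is not PlaywrightCrawler's own retry mechanism, and `fetch_with_backoff`, the `(status, retry_after)` response shape, and the simulated responses are assumptions made for illustration.

```python
import random
import time
from typing import Callable, Optional, Tuple

# Hypothetical response shape: (HTTP status, Retry-After in seconds or None).
Response = Tuple[int, Optional[float]]


def fetch_with_backoff(
    fetch: Callable[[], Response],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call fetch() until it stops returning 429 or retries run out; return the last status."""
    for attempt in range(max_retries + 1):
        status, retry_after = fetch()
        if status != 429:
            return status
        if attempt == max_retries:
            break  # out of retries, do not sleep again
        # Prefer the server's hint; otherwise double the delay on each attempt.
        delay = retry_after if retry_after is not None else min(max_delay, base_delay * 2 ** attempt)
        # Add up to 10% jitter so parallel workers do not retry in lockstep.
        sleep(delay + random.uniform(0, 0.1 * delay))
    return 429


# Simulated server: two 429 responses (the second with Retry-After: 3), then success.
responses = iter([(429, None), (429, 3.0), (200, None)])
waits: list = []
status = fetch_with_backoff(lambda: next(responses), sleep=waits.append)
print(status, len(waits))  # 200 2
```

Lowering concurrency or the request rate so that 429s become rare is usually preferable to relying on retries alone.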
extended-salmon 4/11/2024

ERROR Actor failed with an exception
Traceback (most recent call last):
  File "/usr/src/app/src/main.py", line 150, in main
    actor_input = await Actor.get_input()
                  ^^^^^^^^^^^^^^^^^^^^^^^ ...