Apify Discord Mirror

Hey guys, not the most advanced Apify user, so I need help with scraping leads. The issue is: I scrape the max 5k leads, then try to restart the scraper and it rescrapes the same 5k leads. How can I get it to scrape the next 5k leads?
We're using the proxy feature and our usage is somewhat difficult to predict. We'd like to either be notified via Slack when our account balance goes below a given threshold, OR we'd like to set up automatic account balance top-ups.

Are either of these possible?
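A rough sketch of the Slack half of this is below; the balance lookup itself is left as a placeholder, since the exact Apify API endpoint and response field for remaining balance aren't confirmed here. Slack incoming webhooks, for their part, accept a simple JSON POST:
Plain Text
import os

import httpx

SLACK_WEBHOOK_URL = os.environ['SLACK_WEBHOOK_URL']  # from Slack's "Incoming Webhooks" app
THRESHOLD_USD = 10.0  # alert when balance drops below this

def get_account_balance() -> float:
    # Placeholder: fetch your remaining balance via the Apify API here.
    # The exact endpoint and field are assumptions, so this is left
    # unimplemented on purpose.
    raise NotImplementedError

def check_and_alert() -> None:
    balance = get_account_balance()
    if balance < THRESHOLD_USD:
        # Slack incoming webhooks accept a plain JSON payload with a 'text' key.
        httpx.post(SLACK_WEBHOOK_URL, json={
            'text': f'Apify balance is ${balance:.2f} (threshold ${THRESHOLD_USD:.2f})',
        })

if __name__ == '__main__':
    check_and_alert()

Running a script like this on a schedule (cron or an Apify Schedule) would approximate the threshold alerts.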
Can I just use the free plan and refill some credits, pay as you go? Because my usage is a one-time thing.
2 comments
When using a transparent icon for the Actor (WebP or PNG images), an unexpected black border appears (in Google Chrome at 80% zoom).
1 comment
Hi there. I am coming from ScraperAPI solutions and I am having issues with them. I just want to try Apify.
I am trying to build my first Actor, without any success so far.
The test Actor sample offers a full example. Sounds great, but when I try to use a URL other than the default one (https://www.apify.com), I get an error. For example, I try https://fr.indeed.com and I get an error. Any idea?
1 comment
When I use apify run, it says Python can't be detected. It's installed, it's in the PATH variable, and it works from cmd and PowerShell like a charm. I also updated Node and npm to the latest versions and reinstalled apify-cli.
billsauce

error

Hi, why do I always get this error? I have Apify Pro.
Plain Text
raise ApifyApiError(response, attempt)
apify_client._errors.ApifyApiError: You must rent a paid Actor in order to run it.
I want to test the Apify proxy and how it works, so I can integrate it with my Python code.
Running a very simple check, I found it's not working with HTTPS URLs. Here's a snippet:
Plain Text
import asyncio

import dotenv
import httpx
from apify import Actor

async def main():
    async with Actor:
        # Build an Apify proxy configuration with the password from .env.
        proxy_configuration = await Actor.create_proxy_configuration(
            password=dotenv.get_key('.env', 'APIFY_PROXY_PASSWORD'),
        )
        proxy_url = await proxy_configuration.new_url()
        # Route all requests through the proxy.
        async with httpx.AsyncClient(proxy=proxy_url) as client:
            for _ in range(3):
                response = await client.get('https://httpbin.org/ip')
                if response.status_code == 200:
                    print(response.json())
                else:
                    print(response.text)

if __name__ == '__main__':
    asyncio.run(main())

giving me a proxy error:
Plain Text
          raise mapped_exc(message) from exc
      httpx.ReadTimeout
[apify] INFO  Exiting Actor ({"exit_code": 91})

If I just change the protocol to http://httpbin.org/ip, it works.
Apify proxy should support HTTPS, as stated on the site. Thanks in advance.
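To rule the SDK in or out, the same request can be made with the proxy URL built by hand. A minimal sketch, assuming the standard proxy.apify.com:8000 endpoint (it also appears in a curl example further down this page) and the default 'auto' proxy group username; the password placeholder must be replaced with a real one:
Plain Text
import httpx

# Apify proxy URL format: http://<username>:<password>@proxy.apify.com:8000
# 'auto' is the default proxy group username; replace the password placeholder.
proxy_url = 'http://auto:YOUR_APIFY_PROXY_PASSWORD@proxy.apify.com:8000'

with httpx.Client(proxy=proxy_url) as client:
    response = client.get('https://httpbin.org/ip')
    print(response.status_code, response.text)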
3 comments
What is wrong with my transformation?
Everything under physicianInfo is not being displayed in joboverview.
DuxSec
Solved

Double log output

In main.py logging works as expected; in routes.py, however, logging is printed twice for some reason.
I did not set up any custom logging, I just use
Actor.log.info("STARTING A NEW CRAWL JOB")

example:
Plain Text
[apify] INFO  Checking item 17
[apify] INFO  Checking item 17 ({"message": "Checking item 17"})
[apify] INFO  Processing new item with index: 17
[apify] INFO  Processing new item with index: 17 ({"message": "Processing new item with index: 17"})


If I add this to my main.py (https://docs.apify.com/sdk/python/docs/concepts/logging):
Plain Text
import logging

from apify import Actor
from apify.log import ActorLogFormatter

async def main() -> None:
    async with Actor:
        ##### SETUP LOGGING #####
        handler = logging.StreamHandler()
        handler.setFormatter(ActorLogFormatter())

        apify_logger = logging.getLogger('apify')
        apify_logger.setLevel(logging.DEBUG)
        apify_logger.addHandler(handler)

it prints everything from main.py 2x, and everything from routes.py 3x.

Plain Text
[apify] INFO  STARTING A NEW CRAWL JOB
[apify] INFO  STARTING A NEW CRAWL JOB ({"message": "STARTING A NEW CRAWL JOB"})
[apify] INFO  STARTING A NEW CRAWL JOB ({"message": "STARTING A NEW CRAWL JOB"})
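A doubled line per record usually means two handlers are attached to the same logger (the SDK already configures one when running on the platform). Not a confirmed fix for this thread, but a minimal guard sketch using the same setup as the snippet above:
Plain Text
import logging

from apify.log import ActorLogFormatter

apify_logger = logging.getLogger('apify')
apify_logger.setLevel(logging.DEBUG)

# Attach our handler only if the logger has none yet,
# so each record is emitted exactly once.
if not apify_logger.handlers:
    handler = logging.StreamHandler()
    handler.setFormatter(ActorLogFormatter())
    apify_logger.addHandler(handler)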
11 comments
Hi, I've seen mentions of a "pay per event" pricing model (https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event and https://apify.com/mhamas/pay-per-event-example), but I can't find how to enable it for one of my Actors; I only see the rental and pay-per-result options.
How can we use this pay-per-event pricing model?
8 comments
Hello, I would like to ask if any Apify tool can, for example, find a similar image (https://i.postimg.cc/KzRHFKQc/55.jpg) and extract the product name from the links to CSV. Can we use Google Lens? I want to use this to automatically name antique products.

Thanks for all the information and help! 👋
1 comment

I have created a scraper but am having issues publishing it to the Store. I opened my account 2 days ago and would like to start earning money on my scraper.

I'm attempting to validate that the proxy works and am not having any luck. Should I expect the following to work?

Plain Text
~ λ curl --proxy http://proxy.apify.com:8000  -U 'groups-RESIDENTIAL,country-US:apify_proxy_redacted' -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"  https://httpbin.org/ip
curl: (56) CONNECT tunnel failed, response 403
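For comparison, the same check from Python with requests, reusing the proxy details from the curl command (password redacted as above):
Plain Text
import requests

# Same proxy and credentials as the curl command above (password redacted).
proxy_url = 'http://groups-RESIDENTIAL,country-US:apify_proxy_redacted@proxy.apify.com:8000'

response = requests.get(
    'https://httpbin.org/ip',
    proxies={'http': proxy_url, 'https': proxy_url},
)
print(response.status_code, response.text)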
3 comments
my account: https://apify.com/wudizhangzhi
actors: https://console.apify.com/actors/KAkfFaz8JVdvOQQ5F/source

Error: Operation failed! (You currently don’t have the necessary permissions to publish an Actor. This is expected behavior. Please contact support for assistance in resolving the issue.)

@Saurav Jain
2 comments
Guys, I'm new to Apify and I want to publish my newly built job scraper, but when I set up monetization there are two options, business ID and personal ID. Where can I get these?
1 comment
Hi everyone,
I recently ran a Google Maps scraper (https://apify.com/compass/crawler-google-places) to collect place data, and I've discovered that there are many more places available than what was initially collected in my first run.
Current Situation:
  • Successfully completed an initial scrape
  • Have collected data for X places
  • Discovered there are significantly more places available
  • Already have a dataset from the first run
Questions:
Is it possible to increase the place limit on my existing run configuration?
If I need to create a new run, what's the best way to:
  • Import/merge my existing scraped data
  • Avoid duplicating places already collected
  • Continue from where the previous run stopped
Any guidance on the most efficient approach would be greatly appreciated.
Thanks in advance!
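Not an official recipe, but one way to approach the merge/dedup part with apify-client: pull both runs' datasets and key the items on placeId (assumed here to be the unique identifier field in the scraper's output):
Plain Text
from apify_client import ApifyClient

client = ApifyClient('MY-APIFY-TOKEN')  # placeholder token

# Merge the old and new runs, keeping the first occurrence of each place.
merged: dict[str, dict] = {}
for dataset_id in ('OLD_RUN_DATASET_ID', 'NEW_RUN_DATASET_ID'):
    # iterate_items() pages through the entire dataset.
    for item in client.dataset(dataset_id).iterate_items():
        # 'placeId' is assumed to be the unique key in the output items.
        merged.setdefault(item['placeId'], item)

print(f'{len(merged)} unique places after merging')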
4 comments