Not working on all elements in the request_queue?
Hello, I am trying to run the following Apify code. As you can see in my example code below, I would like to create a queue with 6 elements (combinations):
[('electrician', 'Southampton,England,United Kingdom'), ('electrician', 'Luton,England,United Kingdom'), ('electrician', 'Portsmouth,England,United Kingdom'), ('accountant', 'Southampton,England,United Kingdom'), ('accountant', 'Luton,England,United Kingdom'), ('accountant', 'Portsmouth,England,United Kingdom')]
But when I run the code, only the first element from the queue is processed and then the Actor exits:
[apify] INFO [Status message]: Starting actor...
[apify] INFO [('electrician', 'Southampton,England,United Kingdom'), ('electrician', 'Luton,England,United Kingdom'), ('electrician', 'Portsmouth,England,United Kingdom'), ('accountant', 'Southampton,England,United Kingdom'), ('accountant', 'Luton,England,United Kingdom'), ('accountant', 'Portsmouth,England,United Kingdom')]
[apify] INFO 6
return self.serializer.to_python(
[apify] INFO Working for electrician in Southampton,England,United Kingdom
[apify] INFO [Status message]: Actor finished...
[apify] INFO Exiting Actor ({"exit_code": 0})
['electrician', 'Southampton,England,United Kingdom', 'Cablefrog Electrical']
This is how I build the queue:
...
inpCombinations = list(itertools.product(inpSearchWords, inpSearchLoc))
Actor.log.info(inpCombinations)
Actor.log.info(len(inpCombinations))
baseLink = "https://www.bing.com/maps"
request_queue = await Actor.open_request_queue()
for i, e in enumerate(inpCombinations):
    newReq = Request.from_url(f"{baseLink}#{i}")
    newReq.user_data = {"search": list(e)}
    await request_queue.add_request(newReq)
...
And this is how I try to run through it:
...
while request := await request_queue.fetch_next_request():
    ....
3 Replies
Hi! The code you sent is logically correct and should work, but only one request is processed because you use URLs like {baseLink}#{i}. After normalization, the unique_key becomes identical for all these URLs, so only one request gets handled. To fix this, explicitly set a distinct unique_key for each request. You can read more here: https://docs.apify.com/sdk/python/reference/class/Request#unique_key
Hello, thanks for the response. I tried it now with this code:
request_queue = await Actor.open_request_queue()
for i, e in enumerate(inpCombinations, start=1):
    newReq = Request.from_url(f"{baseLink}#{i}")
    newReq.user_data = {"search": list(e)}
    newReq.unique_key = f"entry{i}"
    await request_queue.add_request(newReq)
But now I get this error 4 times:
The request ID does not match the ID from the unique_key (request.id=sLItjWV1tucXctA, id=wX3ZAQpcaw5bKtW).
The request ID does not match the ID from the unique_key (request.id=sLItjWV1tucXctA, id=rSBjdBzOLZ8oYrB).
The request ID does not match the ID from the unique_key (request.id=sLItjWV1tucXctA, id=pnGkgaDtyM1uq3Z).
The request ID does not match the ID from the unique_key (request.id=sLItjWV1tucXctA, id=gffttf8wTy9MbaR).
Hello, I tested this code on the platform and it works correctly. Unfortunately, it’s not clear from this snippet why the error occurs. Please check if you are using the latest versions of the packages and that your imports follow the documentation. https://docs.apify.com/sdk/python/reference/class/Request