optimistic-gold (15mo ago)

Error adding request to the queue: Request ID does not match its unique_key.

Hi and good day. I'm creating a POST API that accepts the following JSON body: { "url": "https://crawlee.dev/python/", "targets": ["html", "pdf"] }. The targets list holds the file extensions my code downloads whenever it discovers matching files. I'm at my wit's end because I don't understand the error I'm getting: [crawlee.memory_storage_client._request_queue_client] WARN Error adding request to the queue: Request ID does not match its unique_key. Has anyone encountered this problem? The following is my whole code:
9 Replies
Hall (15mo ago)
This post has been pushed to the community knowledgebase. Any replies in this thread will be synced to the community site.
optimistic-gold (OP, 15mo ago)
Hi, any developers that can help me?
Oleg V. (15mo ago)
cc @Vlada Dusek 🙏
HonzaS (15mo ago)
I have never explicitly set the id of a request — what is the purpose? I think it is colliding with an internal crawlee mechanism that sets the id automatically. You can set just the unique_key.
Mantisus (15mo ago)
Hi @Nyanmaru I think you need to use
requests = [Request.from_url(
url=start_url,
user_data={"targets": targets},
unique_key=request_id
)]
requests = [Request.from_url(
url=start_url,
user_data={"targets": targets},
unique_key=request_id
)]
optimistic-gold (OP, 15mo ago)
I'm trying to make a POST endpoint using FastAPI that accepts a JSON body containing the URL and the target extensions, which I want to keep flexible by letting me specify which extensions to download. For example, with the body { "url": "https://crawlee.dev/python/", "targets": ["html", "pdf"] }, my program crawls the provided URL and downloads every file with an html or pdf extension, and I can add more extensions like png or jpg.
optimistic-gold (OP, 15mo ago)
Hi @Mantisus, your solution worked! The only problem is that the targets only stick to the first crawl, then disappear on the next URL.
optimistic-gold (OP, 15mo ago)
Hi @HonzaS, I'm actually trying to make a POST request using FastAPI that accepts a JSON body containing the URL and the target extensions to download from every page I crawl. For example, with the body { "url": "https://crawlee.dev/python/", "targets": ["html", "pdf"] }, my program crawls the provided URL and downloads every file with an html or pdf extension, and I can add more extensions like png or jpg.

Hi everyone! Glad to say this finally worked! I fixed the remaining problem by re-attaching the targets in my enqueue_links call: await context.enqueue_links(user_data={"targets": targets}). Thank you to everyone who answered! 😄
