fair-rose
fair-rose12mo ago

Error adding request to the queue: Request ID does not match its unique_key.

Hi and good day. I'm creating a POST API that accepts the following JSON body: { "url": "https://crawlee.dev/python/", "targets": ["html", "pdf"] }. The targets list holds the file extensions my code downloads whenever it discovers them. I'm at my wit's end because I don't understand the error I'm getting: [crawlee.memory_storage_client._request_queue_client] WARN Error adding request to the queue: Request ID does not match its unique_key. Has anyone encountered this problem? The following is my whole code:
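For context, the endpoint described above needs to validate that body before handing it to the crawler. A minimal, framework-free sketch of that validation (the function name and checks are assumptions for illustration, not the OP's actual code):

```python
import json

# Hypothetical helper (not from the OP's code): validate the POST body
# {"url": ..., "targets": [...]} described in the question.
def parse_crawl_request(raw_body: str) -> tuple[str, list[str]]:
    body = json.loads(raw_body)
    url = body["url"]
    targets = body["targets"]
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError("'url' must be an http(s) URL")
    if not isinstance(targets, list) or not all(isinstance(t, str) for t in targets):
        raise ValueError("'targets' must be a list of extension strings")
    # Normalize extensions so "PDF" and ".pdf" both match later.
    return url, [t.lower().lstrip(".") for t in targets]

url, targets = parse_crawl_request(
    '{"url": "https://crawlee.dev/python/", "targets": ["html", "pdf"]}'
)
```

In FastAPI the same shape would typically be expressed as a Pydantic model on the endpoint, which performs equivalent checks automatically.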
9 Replies
Hall
Hall12mo ago
This post has been pushed to the community knowledgebase. Any replies in this thread will be synced to the community site.
fair-rose
fair-roseOP12mo ago
Hi, any developers that can help me?
Oleg V.
Oleg V.12mo ago
cc @Vlada Dusek 🙏
HonzaS
HonzaS12mo ago
I have never explicitly set the id of a request; what is the purpose? I think it is colliding with an internal Crawlee mechanism that sets the id automatically. You can set just the unique_key.
Mantisus
Mantisus12mo ago
Hi @Nyanmaru I think you need to use
requests = [Request.from_url(
    url=start_url,
    user_data={"targets": targets},
    unique_key=request_id,
)]
fair-rose
fair-roseOP12mo ago
I'm trying to make a POST request using FastAPI that accepts a JSON body containing the URL and the target extensions. I want it to be flexible, so I can input whichever extensions I want to download. For example, if my JSON body is { "url": "https://crawlee.dev/python/", "targets": ["html", "pdf"] }, my program will crawl the URL provided in the JSON body and download all files with the html and pdf extensions, and I can add more like png or jpg.
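The extension check described here could be sketched as a small helper (a hypothetical function for illustration, not the OP's actual code):

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical helper (assumption, not the OP's code): decide whether a
# discovered URL's file extension matches one of the requested targets.
def matches_targets(url: str, targets: list[str]) -> bool:
    path = urlparse(url).path  # ignore query string and fragment
    ext = PurePosixPath(path).suffix.lstrip(".").lower()
    return ext in {t.lower().lstrip(".") for t in targets}
```

Parsing the URL first matters: a naive `url.endswith(".pdf")` would miss links like `report.pdf?download=1`.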
fair-rose
fair-roseOP12mo ago
Hi @Mantisus, your solution worked! The only problem is that the targets only stick to the first crawl and then disappear on the next URL.
fair-rose
fair-roseOP12mo ago
Hi @HonzaS, I'm actually trying to make a POST request using FastAPI that accepts a JSON body containing the URL and the target extensions to download from every page I crawl. For example, if my JSON body is { "url": "https://crawlee.dev/python/", "targets": ["html", "pdf"] }, my program will crawl the URL provided in the JSON body and download all files with the html and pdf extensions, and I can add more like png or jpg.

Hi everyone! Glad to say this finally worked! I fixed the last problem by passing user_data when enqueuing links: await context.enqueue_links(user_data={"targets": targets}) Thank you to everyone who answered! 😄
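Why this fix works: user_data is attached per request, so newly enqueued links do not inherit it unless the handler passes it along explicitly. A toy, framework-free simulation of that behavior (an illustration of the idea, not crawlee's actual implementation):

```python
from collections import deque

# Toy crawler: each queued request carries its own user_data dict, and
# enqueued links receive ONLY what the caller passes along (mirroring
# per-request user_data in a real request queue).
def crawl(links: dict[str, list[str]], start: str,
          targets: list[str], propagate: bool) -> dict[str, list[str]]:
    """Return the targets each request handler sees, per URL."""
    queue = deque([(start, {"targets": targets})])
    seen: dict[str, list[str]] = {}
    while queue:
        url, user_data = queue.popleft()
        if url in seen:
            continue
        seen[url] = user_data.get("targets", [])
        for link in links.get(url, []):
            # Without propagation the new request has empty user_data,
            # which matches what the OP observed after the first crawl.
            queue.append((link, {"targets": targets} if propagate else {}))
    return seen
```

With `propagate=False` only the start URL's handler sees the targets; with `propagate=True` every request does, which is exactly what passing user_data to enqueue_links achieves.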