FoudreTower 11mo ago

crawler.run only scrapes the first URL

Hi, my problem is that crawler.run(['https://keepa.com/#!product/4-B07GS6ZB7T', 'https://keepa.com/#!product/4-B0BZSWWK48']) only scrapes the first URL. I think this is because Crawlee treats them as the same URL: if I replace the "#" with a "?", it works. Is there any way to make it work with URLs like these?
3 Replies
Lukas Celnar 11mo ago
Hi @FoudreTower. The #! fragment is used for client-side navigation only, so the crawler sees these URLs as duplicates. When you replace it with ?, the part after it is no longer a hash fragment, and Crawlee uses the whole URL when deduplicating. One way around this is to set a uniqueKey when enqueuing.
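For anyone landing here later, a minimal sketch of the uniqueKey approach. The helper names `toRequests` and `stripFragment` are hypothetical (not part of Crawlee), and `stripFragment` is only a rough stand-in for Crawlee's real URL normalization, used to show why the two URLs collide by default. It assumes `crawler` is an already constructed Crawlee crawler, as in the question.

```typescript
// Shape of the request objects passed to crawler.run(); url and uniqueKey
// are the only fields this sketch relies on.
interface RequestLike {
  url: string;
  uniqueKey: string;
}

// Hypothetical helper: use the full URL (fragment included) as the uniqueKey,
// so URLs that differ only after '#' are no longer treated as duplicates.
function toRequests(urls: string[]): RequestLike[] {
  return urls.map((url) => ({ url, uniqueKey: url }));
}

// Simplified illustration of the default behaviour: everything from '#'
// onwards is ignored when the deduplication key is computed.
function stripFragment(url: string): string {
  const i = url.indexOf('#');
  return i === -1 ? url : url.slice(0, i);
}

const urls = [
  'https://keepa.com/#!product/4-B07GS6ZB7T',
  'https://keepa.com/#!product/4-B0BZSWWK48',
];

const requests = toRequests(urls);

// Both URLs collapse to one key without the fragment...
const defaultKeys = new Set(urls.map(stripFragment));
// ...but stay distinct once uniqueKey is set explicitly.
const explicitKeys = new Set(requests.map((r) => r.uniqueKey));

// With the crawler from the question (assumed to exist):
// await crawler.run(requests);
```

Crawlee's request options also list a keepUrlFragment flag that may achieve the same thing without a custom key; it is worth checking the docs for your version before relying on it.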
FoudreTower (OP) 11mo ago
Thanks @Lukas Celnar, it works with uniqueKey.
