ambitious-aqua
ambitious-aqua•3y ago

requestQueue doesn't delete requests after visiting and saving data

Hi, working with crawlee and playwright. I've noticed that requests aren't being popped out of the queue even though the links have already been visited and scraped. Am I missing a configuration or something?
ambitious-aqua
ambitious-aquaOP•3y ago
The queue looks like this even after all of these requests have already been visited:
ambitious-aqua
ambitious-aquaOP•3y ago
(attached screenshot, not captured in this text export)
ambitious-aqua
ambitious-aquaOP•3y ago
my default router (I have one other router for the DETAILS request, but it does not enqueue links):
ambitious-aqua
ambitious-aquaOP•3y ago
(attached screenshot, not captured in this text export)
ambitious-aqua
ambitious-aquaOP•3y ago
it also seems like it dumps the requests back to the queue after scraping...
ambitious-aqua
ambitious-aquaOP•3y ago
(attached screenshot, not captured in this text export)
ambitious-aqua
ambitious-aquaOP•3y ago
follow up: I'm dumb, in one case there wasn't a `<p>` element, so the selector wasn't finding anything, which is why the request was failing.
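For anyone hitting the same symptom: a handler that throws (for example because `textContent()` returned `null` for a missing `<p>`) is retried by the crawler, which is why requests can appear to be "dumped back" into the queue. A minimal null-safe sketch, assuming you pass it whatever `textContent()` returned (the helper name and fallback are illustrative, not part of Crawlee's or Playwright's API):

```typescript
// Playwright's textContent() resolves to `string | null`; if the selector
// matched nothing, propagating that null (or throwing on it) makes the
// whole request handler fail and get retried. Defaulting avoids that.
function safeText(textContent: string | null | undefined, fallback = ""): string {
  return textContent?.trim() ?? fallback;
}

safeText("  hello  "); // "hello"
safeText(null);        // "" — a missing <p> no longer crashes the handler
```

The same idea works inline with optional chaining at the extraction site; the point is simply to decide on a fallback instead of letting the exception bubble up and trigger a retry.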
Alexey Udovydchenko
Alexey Udovydchenko•3y ago
It's a feature: requests are stored per run so each unique URL is processed only once. That way you can enqueue all the sublinks from a website without going into an endless scraping loop.
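Conceptually, the behaviour looks like this (a simplified sketch, not Crawlee's actual implementation): the queue remembers every unique key it has seen for the lifetime of the run, so handled requests stay recorded and re-adding an already-visited URL is a no-op:

```typescript
// Toy model of per-run request deduplication: `seen` persists for the
// whole run, so a URL can be enqueued at most once even if every page
// links back to it.
class DedupQueue {
  private seen = new Set<string>();
  private pending: string[] = [];

  // Returns true only if the request was actually enqueued.
  add(url: string): boolean {
    if (this.seen.has(url)) return false; // already queued or handled
    this.seen.add(url);
    this.pending.push(url);
    return true;
  }

  next(): string | undefined {
    return this.pending.shift(); // handled requests remain in `seen`
  }
}

const q = new DedupQueue();
q.add("https://example.com/a"); // true, enqueued
q.next();                       // "https://example.com/a"
q.add("https://example.com/a"); // false, deduplicated
```

This is why the stored requests don't disappear after being visited: keeping them around is exactly what prevents the crawler from revisiting the same links forever.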
ambitious-aqua
ambitious-aquaOP•3y ago
thanks, it took me a while, but I read the documentation and realized that was the case!
