fascinating-indigo•2y ago

Use RabbitMQ as an alternative queue.

Hey guys, I tried a few approaches and also read through the experiments in the thread linked below. The problem seems to be that Crawlee is quite slow to start up and begin processing requests. After I add a URL (with its uniqueKey) to the requestQueue, it takes 2-3 minutes or more for the crawler to pick it up, or I have to add more links to the queue before it starts. Can you guys help me? https://discord.com/channels/801163717915574323/1056348705407651941
fascinating-indigo (OP)•2y ago
Some logs that I have
INFO PuppeteerCrawler: Starting the crawler.
{"fields":{"consumerTag":"amq.ctag-fnm1hOkyM-n6CJcBFKAVhA","deliveryTag":1,"redelivered":false,"exchange":"CRAWLER_REQUEST","routingKey":"NEW_REQUEST"},"properties":{"headers":{},"deliveryMode":2},"content":{"type":"Buffer","data":[123,34,117,114,108,34,58,48,125]}} - Received message: {"url":0}
done
INFO Statistics: PuppeteerCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":60010,"retryHistogram":[]}
INFO PuppeteerCrawler:AutoscaledPool: state {"currentConcurrency":0,"desiredConcurrency":1,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
I've resolved it now. It was caused by an old Crawlee instance that was still running.
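One more detail from the log above, separate from the stale-instance cause: the received message body decodes to {"url":0}, so the url field is a number rather than a URL string. It may be worth validating payloads before they reach the request queue. Below is a minimal sketch of that check, assuming the message content arrives in the serialized Buffer shape shown in the log. The names SerializedBuffer, decodePayload and extractUrl are made up for illustration and are not part of Crawlee or any RabbitMQ client.

```typescript
// Shape of the serialized Buffer as it appears in the log:
// {"type":"Buffer","data":[...]}
interface SerializedBuffer {
    type: string;
    data: number[];
}

// Turn the byte array back into the JSON text that was published.
function decodePayload(content: SerializedBuffer): string {
    return new TextDecoder("utf-8").decode(new Uint8Array(content.data));
}

// Hypothetical helper: return an http(s) URL string from the payload,
// or null when the payload is malformed or the url field is unusable.
function extractUrl(jsonText: string): string | null {
    let parsed: unknown;
    try {
        parsed = JSON.parse(jsonText);
    } catch {
        return null; // not valid JSON
    }
    if (typeof parsed !== "object" || parsed === null) {
        return null;
    }
    const url = (parsed as { url?: unknown }).url;
    if (typeof url !== "string") {
        return null; // e.g. {"url":0} from the log
    }
    try {
        const protocol = new URL(url).protocol;
        return protocol === "http:" || protocol === "https:" ? url : null;
    } catch {
        return null; // string, but not a parseable URL
    }
}

// The exact bytes from the logged message.
const content: SerializedBuffer = {
    type: "Buffer",
    data: [123, 34, 117, 114, 108, 34, 58, 48, 125],
};

const text = decodePayload(content);
console.log(text);
console.log(extractUrl(text));
console.log(extractUrl('{"url":"https://example.com/"}'));
```

With a check like this in the consumer, a message would only be passed on to requestQueue.addRequest when extractUrl returns a string, and anything else could be logged and rejected instead of silently producing no work for the crawler.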
