Apify Discord Mirror

Updated 5 months ago

isFinishedFunction, check other crawler?

At a glance
The community member is trying to modify the isFinishedFunction in the autoscaledPoolOptions to check if both the current request_queue and a separate web_crawler_queue are finished. The original implementation worked sometimes, but the crawler would get stuck and throw a "stalled" error. Another community member suggests "monkey-patching" the original isFinishedFunction to add the additional check for the web_crawler_queue. However, the community member notes that the modified code still doesn't work all the time, and they are seeing logs indicating system overload issues.
Hello,

two questions.

Is there a way to call this.isFinishedFunction so that it runs the original function and also adds a web_crawler_queue isFinished check on top of it? The uncommented function I tried worked somewhat, but after a long-running web_crawler_queue finished, it just kept giving me a stalled error, and the crawler this function belongs to never finished.



Plain Text
    autoscaledPoolOptions: {
      isFinishedFunction: async () => {
        const web_crawler_queue = await RequestQueue.open(place_id)
        // return this.isFinishedFunction() && await web_crawler_queue.isFinished()
        return await request_queue.isFinished() && await web_crawler_queue.isFinished()
      }
    },
8 comments
There is no original function to call, since you are putting your own in the options. If you want to use the original, you would have to monkey-patch it after the crawler is defined.

Plain Text
// not sure about the .bind
const origFn = crawler.autoscaledPool.isFinishedFunction.bind(crawler)
crawler.autoscaledPool.isFinishedFunction = async () => {
    // await the original result; a bare promise is always truthy
    const orig = await origFn()
    // etc.
}
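For illustration, the "original check plus an extra check" idea can be sketched without any crawler at all. This is only a sketch under assumptions: `withExtraFinishedChecks`, `makeFakeQueue` and `demo` are hypothetical names, and the fake queues model nothing beyond an async `isFinished()` method.

```javascript
// Hypothetical helper: wraps an existing "is finished" check so that the
// combined check is true only when the original and every extra check are true.
function withExtraFinishedChecks(origFn, ...extraChecks) {
  return async () => {
    // Await the original result; an un-awaited promise is always truthy.
    if (!(await origFn())) return false;
    for (const check of extraChecks) {
      if (!(await check())) return false;
    }
    return true;
  };
}

// Stand-ins for the two queues; only isFinished() is modelled here.
const makeFakeQueue = (finished) => ({ isFinished: async () => finished });

async function demo() {
  const requestQueue = makeFakeQueue(true);
  const webCrawlerQueue = makeFakeQueue(false);
  const isFinished = withExtraFinishedChecks(
    () => requestQueue.isFinished(),
    () => webCrawlerQueue.isFinished(),
  );
  // The second queue still reports unfinished work, so this resolves to false.
  return isFinished();
}
```

With a real crawler, the first argument would be the saved original function and the extra check would call `isFinished()` on a queue that was opened once, outside the check, rather than on every call.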
got it, will try this
my code above almost works, but for some reason it doesn't work all the time..
which basically checks: is my current request_queue finished, and is my other crawler finished
but sometimes it doesn't pick up and I'll get a log along the lines of: "x, y, z has been stalled for 350 seconds"
Plain Text
{
  "isSystemIdle": false,
  "memInfo": {
    "isOverloaded": true,
    "limitRatio": 0.2,
    "actualRatio": 1
  },
  "eventLoopInfo": {
    "isOverloaded": false,
    "limitRatio": 0.6,
    "actualRatio": 0.057
  },
  "cpuInfo": {
    "isOverloaded": true,
    "limitRatio": 0.4,
    "actualRatio": 1
  },
  "clientInfo": {
    "isOverloaded": false,
    "limitRatio": 0.3,
    "actualRatio": 0
  }
}
this: RNING","msg":"RequestQueue: The request queue seems to be stuck for 370s, resetting internal state.","inProgress":[]}
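To make the status snapshot above easier to read, here is a small sketch that lists which resources it flags as overloaded. `overloadedResources` is a hypothetical helper, not part of any library, and it assumes the snapshot keeps the shape shown in the log.

```javascript
// Hypothetical helper: returns the names of the entries in a status snapshot
// whose isOverloaded flag is true. Non-object fields such as isSystemIdle are skipped.
function overloadedResources(status) {
  return Object.entries(status)
    .filter(([, info]) => info !== null && typeof info === 'object' && info.isOverloaded === true)
    .map(([name]) => name);
}

// Values copied from the snapshot in the log above.
const status = {
  isSystemIdle: false,
  memInfo: { isOverloaded: true, limitRatio: 0.2, actualRatio: 1 },
  eventLoopInfo: { isOverloaded: false, limitRatio: 0.6, actualRatio: 0.057 },
  cpuInfo: { isOverloaded: true, limitRatio: 0.4, actualRatio: 1 },
  clientInfo: { isOverloaded: false, limitRatio: 0.3, actualRatio: 0 },
};

console.log(overloadedResources(status)); // ['memInfo', 'cpuInfo']
```

For this snapshot, memory and CPU are the two overloaded resources, while the event loop and the client are within their limits.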