dual-salmon
Apify & Crawlee, 12mo ago

Crawlee not respecting cgroup resource limits

Crawlee doesn't seem to respect resource limits imposed by cgroups. This causes problems in containerised environments, where Crawlee either gets OOM-killed or silently slows to a crawl because it thinks it has far more resources available than it actually does. Reading and setting the maximum RAM is pretty easy:

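import { existsSync, readFileSync } from 'node:fs';
import { log } from 'crawlee'; // assumes Crawlee's exported logger

// Returns the cgroup v2 memory limit in MB, or null if no limit can be determined.
function getMaxMemoryMB(): number | null {
  const cgroupPath = '/sys/fs/cgroup/memory.max';

  if (!existsSync(cgroupPath)) {
    log.warning('Cgroup v2 memory limit file not found.');
    return null;
  }

  try {
    const data = readFileSync(cgroupPath, 'utf-8').trim();

    // cgroup v2 reports the literal string "max" when the limit is unset.
    if (data === 'max') {
      log.warning('No memory limit set (cgroup reports "max").');
      return null;
    }

    const maxMemoryBytes = parseInt(data, 10);
    if (Number.isNaN(maxMemoryBytes)) {
      log.warning(`Unexpected cgroup memory limit value: "${data}".`);
      return null;
    }
    return maxMemoryBytes / (1024 * 1024); // Convert bytes to MB
  } catch (error) {
    log.exception(error as Error, 'Error reading cgroup memory limit:');
    return null;
  }
}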

This can then be used to set a reasonable RAM limit for Crawlee. The CPU limits, however, are proving more difficult. Has anyone found a fix yet?
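For reference, the CPU quota itself can at least be read the same way as the memory limit. Below is a minimal sketch, assuming cgroup v2, where `/sys/fs/cgroup/cpu.max` holds `<quota> <period>` in microseconds, or `max <period>` when no limit is set. The names `parseCpuMax` and `getMaxCpus` are hypothetical helpers, not part of Crawlee, and this only reads the limit; it does not make Crawlee honour it.

```typescript
import { existsSync, readFileSync } from 'node:fs';

// Parses the contents of a cgroup v2 cpu.max file into a fractional CPU count.
// Returns null when no limit is set or the contents cannot be parsed.
function parseCpuMax(contents: string): number | null {
  const [quota, period] = contents.trim().split(/\s+/);
  if (!quota || quota === 'max') return null; // unlimited or empty file

  const quotaUs = Number(quota);
  // The kernel always writes a period; 100000 is only a defensive fallback.
  const periodUs = period ? Number(period) : 100000;
  if (!Number.isFinite(quotaUs) || !Number.isFinite(periodUs) || periodUs <= 0) {
    return null;
  }
  return quotaUs / periodUs; // e.g. 0.5 means half of one CPU
}

// Reads the limit from the conventional cgroup v2 mount point (an assumption;
// the path may differ on hosts that mount the cgroup filesystem elsewhere).
function getMaxCpus(path = '/sys/fs/cgroup/cpu.max'): number | null {
  if (!existsSync(path)) return null;
  try {
    return parseCpuMax(readFileSync(path, 'utf-8'));
  } catch {
    return null;
  }
}
```

Keeping the parsing separate from the file read makes it easy to check against sample values without running inside a container.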