strock77 · 3mo ago

Safe to parallelize Dataset writes across processes?

Context:
• Crawlee v3.13.10, Node 22
• Linux (ext4), using storage-local
• Multiple forked workers share one RequestQueueV2 (with request locking)
• Each worker does:

import { Dataset } from 'crawlee';

const dataset = await Dataset.open('default');
await dataset.pushData(item);
Is it safe for N processes to push to the same dataset concurrently with storage-local (no corruption or partial writes)? Are there any guarantees about atomicity or ordering? If it's not recommended, what's the best pattern: per-worker datasets that get merged afterwards? Any flags or settings I should use to make this robust? Thanks!
Miny · 3mo ago
With storage-local on ext4 it's technically safe: pushData writes each item atomically as a complete JSON record, so you won't get corruption or partial writes even when N workers write at once.

What you don't get is strict ordering across processes; writes from different workers can interleave in whatever order the filesystem commits them. If ordering matters, you'd need a coordination layer.

The common pattern is to give each worker its own dataset (default-1, default-2, etc.) and merge at the end; that keeps clean separation and a predictable order per worker (see the sketch below). No extra flags are needed, just make sure all workers share the same storage dir.

If you need stronger guarantees (ordering or locking), you'd usually go with storage-cloud or an external DB instead of storage-local.
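A minimal sketch of that per-worker-then-merge pattern, assuming standard Crawlee v3 APIs (Dataset.open, pushData, and Dataset.forEach) and an ESM entry point (Node 22 with top-level await). WORKER_ID and NUM_WORKERS are hypothetical env vars your fork logic would have to set; adapt the naming to however you spawn workers.

import { Dataset } from 'crawlee';

// --- In each forked worker ---
// WORKER_ID is a hypothetical env var set by the parent when forking.
const workerId = process.env.WORKER_ID ?? '0';
const dataset = await Dataset.open(`default-${workerId}`);
// `item` stands in for whatever your handler scraped.
const item = { url: 'https://example.com', title: 'Example' };
await dataset.pushData(item);

// --- In the parent, after all workers have exited ---
// NUM_WORKERS is likewise a hypothetical env var.
const numWorkers = Number(process.env.NUM_WORKERS ?? '1');
const merged = await Dataset.open('merged');
for (let i = 0; i < numWorkers; i += 1) {
    const part = await Dataset.open(`default-${i}`);
    // forEach walks items in insertion order, so each worker's
    // relative ordering is preserved inside the merged dataset.
    await part.forEach(async (record) => {
        await merged.pushData(record);
    });
}

Merging item by item like this keeps memory flat even for large datasets; if you ever need a global order, add a timestamp field at write time and sort after merging.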
strock77 (OP) · 3mo ago
Thanks for the great explanation. I was thinking of going with a simple Redis or SQLite storage, but since ordering isn't important for me, I'll go with the default dataset. Thanks!
