strock77
4h ago

Safe to parallelize Dataset writes across processes?

Context:
• Crawlee v3.13.10, Node 22
• Linux (ext4), using storage-local
• Multiple forked workers share one RequestQueueV2 (with request locking)
• Each worker does:
const dataset = await Dataset.open('default');
await dataset.pushData(item);
Questions:
• Is it safe for N processes to push to the same dataset concurrently with storage-local (no corruption or partial writes)?
• Are there any guarantees about atomicity or ordering?
• If this is not recommended, what is the best pattern? Per-worker datasets, then merge?
• Are there any flags or settings I should use to make this robust?

Thanks!
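For reference, the per-worker fallback mentioned above could look roughly like the sketch below. It is only an illustration under assumptions, not a confirmed Crawlee recipe: the `results-worker-<id>` naming scheme, `DatasetLike`, `OpenDataset`, and `mergeWorkerDatasets` are all hypothetical names. The sketch relies only on `pushData` and on `getData()` returning an object with `items`, which Crawlee's `Dataset` is assumed to satisfy.

```typescript
// Hypothetical structural type: only the two methods this sketch relies on.
// Crawlee's Dataset is assumed (not verified here) to be compatible with it.
interface DatasetLike<T> {
  pushData(data: T | T[]): Promise<void>;
  getData(): Promise<{ items: T[] }>;
}

// Hypothetical factory type; with Crawlee this would wrap Dataset.open(name).
type OpenDataset<T> = (name: string) => Promise<DatasetLike<T>>;

// Hypothetical naming scheme: one named dataset per worker, so no two
// processes ever write to the same dataset directory.
function workerDatasetName(workerId: number): string {
  return `results-worker-${workerId}`;
}

// Runs in a single process after all workers have exited: copies every
// per-worker dataset into the target dataset, in worker-id order.
// Returns the number of items copied.
async function mergeWorkerDatasets<T>(
  open: OpenDataset<T>,
  workerCount: number,
  targetName: string,
): Promise<number> {
  const target = await open(targetName);
  let merged = 0;
  for (let id = 0; id < workerCount; id++) {
    const source = await open(workerDatasetName(id));
    // Reads the whole dataset at once; a large dataset would need paging.
    const { items } = await source.getData();
    if (items.length > 0) {
      await target.pushData(items);
      merged += items.length;
    }
  }
  return merged;
}
```

Each worker would then call `Dataset.open(workerDatasetName(workerId))` instead of `Dataset.open('default')`, and the parent would call `mergeWorkerDatasets((name) => Dataset.open(name), N, 'merged')` once the workers are done. Ordering in the merged dataset is then by worker id, not by the time each item was scraped.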