xenial-black•3y ago

Overwriting old values in data storage

Hi! Is there any way to overwrite just the changed values in a named storage, or to remove all items before the new items from the run are added? Thanks!
9 Replies
Pepa J•3y ago
Hello @Pepa, you can drop the whole dataset via dataset.drop() (https://docs.apify.com/sdk/js/docs/2.3/api/dataset#datasetdrop) and then create it again, empty.
xenial-blackOP•3y ago
Hello @Pepa J, thanks for your reply. That works well on localhost with Crawlee. On the Apify platform, though, I see only "pageFunction" and navigationHooks. I would set a condition to run it only before the first URL, but I can't even access Actor to open the dataset from datasetName.
Pepa J•3y ago
@Pepa So you are just using a ready-made actor (like puppeteer crawler) and have limited access to its functions, since you can only change its input?
xenial-blackOP•3y ago
@Pepa J Yes, the Cheerio Crawler.
Pepa J•3y ago
So a simple hack could be:
if (!global.first) {
    log.info('called only once');
    global.first = true;
}
It is not perfect: after a migration the process restarts and global.first is reset, so the guarded block runs again and may delete all your data. How do you store the data in the external dataset? Edit: Oh, I see, it is part of the input. And you probably cannot drop the dataset anyway, because Dataset is not available in that context. Hmm...
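The migration caveat comes from the flag living only in process memory. A minimal sketch of a guard that keeps the flag in a persistent key-value store instead is below. The helper name runOnce and the key 'RESET_DONE' are hypothetical, and `store` is assumed to be any object with async getValue(key) and setValue(key, value) methods. It does not protect against two concurrent calls passing the check at the same moment.

```javascript
// Sketch: run `action` at most once per run, even if the process restarts.
// `store` is any object with async getValue(key) / setValue(key, value).
// The helper name and the default key are hypothetical choices.
async function runOnce(store, action, key = 'RESET_DONE') {
    // A persisted flag survives a restart, unlike a property on `global`.
    if (await store.getValue(key)) return false;
    // Set the flag before acting: for a destructive action it is safer
    // to skip it after a crash than to repeat it.
    await store.setValue(key, true);
    await action();
    return true;
}
```

As far as I know, the context object passed to pageFunction in Cheerio Scraper exposes getValue and setValue for the run's default key-value store, so it could probably be passed as `store`. Treat that as an assumption to check against the actor's documentation.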
xenial-blackOP•3y ago
In the end I'll keep the incremental store and handle it afterwards. Probably the only solution would be to create a custom actor based on the Cheerio Crawler.
Pepa J•3y ago
You could create a very simple actor that drops the dataset and then automatically runs your already created and configured cheerio-scraper actor.
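A rough sketch of such a wrapper actor, under some assumptions: the function name resetAndRun and the dataset name 'my-dataset' are hypothetical, and the `actor` argument is expected to provide openDataset(name) and call(actorId, input), which the `Actor` class in Apify SDK v3 does (in SDK v2 the equivalents were Apify.openDataset and Apify.call). It is injected here only so the logic can be tried without the platform.

```javascript
// Sketch of a "reset, then scrape" wrapper.
// `actor` must provide openDataset(name) and call(actorId, input);
// it is passed in as a parameter rather than imported.
async function resetAndRun(actor, datasetName, scraperId, scraperInput) {
    // Open the named dataset and remove it together with all its items.
    const dataset = await actor.openDataset(datasetName);
    await dataset.drop();
    // Start the already configured scraper and wait for its run to finish.
    return actor.call(scraperId, scraperInput);
}

// Possible wiring in the actor's main file (assumes the `apify` package, v3,
// and that the wrapper's input is the input meant for the scraper):
//   import { Actor } from 'apify';
//   await Actor.init();
//   await resetAndRun(Actor, 'my-dataset', 'apify/cheerio-scraper', await Actor.getInput());
//   await Actor.exit();
```

The scraper input would still need to carry the same dataset name, so that the scraper writes into a fresh dataset under that name.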
HonzaS•3y ago
Just one note: if you drop the dataset and then create a new one, it will have a different URL, because the new dataset gets a new ID.
