Is requiring an output schema for the actor quality score a mistake?

The actor quality score takes the presence of an output schema into consideration (it is one of the suggested improvements). I think this is a mistake. Most of my actors use the default dataset store, which technically does not require an output schema. I assume the intention is to check for a dataset schema rather than an output schema. Is that correct? That is at least what the description seems to imply:

"Help users quickly see what data your Actor produces. Without an output schema, users see raw data that's hard to interpret. Add a schema to display results in organized tables with clear field names and types. All Actors should have one defined, even if they don't produce a dataset."
Link: https://docs.apify.com/platform/actors/development/actor-definition/output-schema?fpr=7p4wu

This description implies a dataset schema (which would make sense), but the title and link reference the output schema. These are two different things:
- Output schema: a specification of where the data is stored.
- Dataset schema: a description of the fields in the dataset.
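To make the distinction concrete, here is a minimal sketch of what the two files might contain, written as Python dicts that serialize to JSON. The key names follow my reading of the Apify docs and should be checked against the current specification. The `title` and `url` fields are hypothetical placeholders, and I am assuming the files would live at `.actor/output_schema.json` and `.actor/dataset_schema.json`.

```python
import json

# Output schema: says WHERE the results live (here, the default dataset).
# Key names are assumed from the Apify docs; verify before relying on them.
output_schema = {
    "actorOutputSchemaVersion": 1,
    "title": "Output schema (sketch)",
    "properties": {
        "results": {
            "type": "string",
            "title": "Results",
            # Plain string, not an f-string: the platform fills in the template.
            "template": "{{links.apiDefaultDatasetUrl}}/items",
        }
    },
}

# Dataset schema: describes the FIELDS of each item and how to show them.
# "title" and "url" are hypothetical field names used only for illustration.
dataset_schema = {
    "actorSpecification": 1,
    "fields": {},
    "views": {
        "overview": {
            "title": "Overview",
            "transformation": {"fields": ["title", "url"]},
            "display": {
                "component": "table",
                "properties": {
                    "title": {"label": "Title", "format": "text"},
                    "url": {"label": "URL", "format": "link"},
                },
            },
        }
    },
}

print(json.dumps(output_schema, indent=2))
print(json.dumps(dataset_schema, indent=2))
```

Note that only the second dict mentions field names and a table view, which is what the quoted description ("organized tables with clear field names and types") seems to be talking about.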
2 Replies
Matous (3w ago)
Hey @Louis Deconinck, I don't have much experience with those yet, but I suggest you provide both schemas. They can be very minimal, but once you implement them, the algorithm should give you a better rating. A bit of context: this is still a new feature, so we don't know yet what should be enforced more or less strictly. The schemas are also a new feature that we want users to adopt, so we can nudge them a bit like this. Unfortunately, even where it doesn't always make sense to use them, they will be "required" for simplicity.
xMiso (2w ago)
I think the point is that with output.json you can tidy up and beautify the output the user sees, e.g. you can have separate tabs showing different fields, decide the order of fields, etc. It has nothing to do with the dataset store, only with how the data is presented. But sure, it only makes sense for users who look at the visual output. If somebody grabs the raw data as JSON via the API, or downloads a CSV, then output.json is useless.
