Before wandb, my usual approach to organizing a dataset was to keep lots of subfolders:
mnist
    complete
        augmented-mild
        augmented-heavy
    sampled-examples
        mnist-1000
            augmented-mild
            augmented-heavy
        mnist-10k
            augmented-mild
            augmented-heavy
    sampled-class-examples
        mnist-1000-5cls
        mnist-10k-5cls
After going through the wandb Artifacts docs, it seems best to use a flattened structure for dataset versioning. How much flattening is ideal? Complete flattening would mean giving each of the datasets above a different name and the same type (say “balanced-dataset”). But completely flattening the dataset hierarchy seems to take away wandb's “versioning” ability, since all of them would then be different artifacts.
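
To make the question concrete, here is a minimal sketch of what complete flattening could look like. It assumes the folder paths below match my layout, and it keeps the original hierarchy in the artifact metadata so it is not lost. The helper names flatten_path and log_dataset, the "hierarchy" metadata key, and the project name "mnist-datasets" are made up for illustration; only wandb.init, wandb.Artifact, add_dir, and log_artifact are actual wandb calls.

```python
from pathlib import PurePosixPath

# Leaf folders from the layout above (the exact nesting is an assumption).
DATASET_PATHS = [
    "mnist/complete/augmented-mild",
    "mnist/complete/augmented-heavy",
    "mnist/sampled-examples/mnist-1000/augmented-mild",
    "mnist/sampled-examples/mnist-1000/augmented-heavy",
    "mnist/sampled-examples/mnist-10k/augmented-mild",
    "mnist/sampled-examples/mnist-10k/augmented-heavy",
    "mnist/sampled-class-examples/mnist-1000-5cls",
    "mnist/sampled-class-examples/mnist-10k-5cls",
]


def flatten_path(path, artifact_type="balanced-dataset"):
    """Turn one folder path into a flat artifact spec (hypothetical helper).

    Every folder gets its own artifact name, all share one type, and the
    original hierarchy is kept as metadata.
    """
    parts = PurePosixPath(path).parts
    return {
        "name": "-".join(parts),
        "type": artifact_type,
        "metadata": {"hierarchy": list(parts)},
    }


def log_dataset(local_dir, spec, project="mnist-datasets"):
    """Upload one folder as a wandb artifact (hypothetical helper).

    Needs the wandb package and a logged-in account, so the import is kept
    inside the function and nothing is uploaded on import of this snippet.
    """
    import wandb

    run = wandb.init(project=project, job_type="dataset-upload")
    artifact = wandb.Artifact(
        name=spec["name"], type=spec["type"], metadata=spec["metadata"]
    )
    artifact.add_dir(local_dir)
    run.log_artifact(artifact)
    run.finish()


SPECS = [flatten_path(p) for p in DATASET_PATHS]

if __name__ == "__main__":
    for spec in SPECS:
        print(spec["name"], spec["type"])
```

With this scheme, each of the eight folders becomes a separate artifact, and versions (v0, v1, ...) would only appear when the contents of one particular folder change and it is logged again under the same name. That is exactly the part I am unsure about.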