Before using wandb, I typically organized my datasets into lots of nested subfolders:
mnist
    complete
        augmented-mild
        augmented-heavy
    sampled-examples
        mnist-1000
            augmented-mild
            augmented-heavy
        mnist-10k
            augmented-mild
            augmented-heavy
    sampled-class-examples
        mnist-1000-5cls
        mnist-10k-5cls
After going through the wandb Artifacts docs, it seems best to use a flattened structure for dataset versioning. How much flattening is ideal? Complete flattening would mean giving each of the folders above a different artifact name and the same type (say, “balanced-dataset”). But completely flattening the dataset hierarchy seems to take away wandb's “versioning” ability, since all of them would then be different artifacts.
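
To make "complete flattening" concrete, here is a rough sketch of what I mean. The naming scheme (joining the path components with "-") and the helper names are just my assumptions for illustration, not something the docs prescribe. The code only builds the (name, type) pairs; with wandb installed, each pair could be passed to wandb.Artifact(name=..., type=...).

```python
# Leaf folders from the tree above, as paths relative to the dataset root.
LEAF_PATHS = [
    "mnist/complete/augmented-mild",
    "mnist/complete/augmented-heavy",
    "mnist/sampled-examples/mnist-1000/augmented-mild",
    "mnist/sampled-examples/mnist-1000/augmented-heavy",
    "mnist/sampled-examples/mnist-10k/augmented-mild",
    "mnist/sampled-examples/mnist-10k/augmented-heavy",
    "mnist/sampled-class-examples/mnist-1000-5cls",
    "mnist/sampled-class-examples/mnist-10k-5cls",
]

# Single shared type, as in the question.
ARTIFACT_TYPE = "balanced-dataset"


def flatten(path: str) -> str:
    """Hypothetical naming scheme: join path components with '-'."""
    return "-".join(path.split("/"))


def artifact_specs(paths):
    """Return one (artifact_name, artifact_type) pair per leaf folder."""
    return [(flatten(p), ARTIFACT_TYPE) for p in paths]


specs = artifact_specs(LEAF_PATHS)
for name, type_ in specs:
    # Each pair would become its own artifact, e.g.
    # wandb.Artifact(name=name, type=type_)
    print(f"{name}  (type={type_})")
```

Every leaf folder ends up as its own artifact name under one type, so each artifact has a separate version history, which is exactly the concern above.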
