Dataset selection for hyperparameter optimization and training

I want to build a multiclass classifier using a BERT model. For this I would like to compare the performance of (at least) two domain-specific BERT models. But before I compare model performance, I would like to find the best hyperparameters using wandb sweeps and the simpletransformers API (simpletransformers has an easy integration with wandb).
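For context, my sweep setup looks roughly like this. The hyperparameter names follow simpletransformers' model args; the metric name and the concrete values are just illustrative assumptions, not my final choices:

```python
# A minimal wandb sweep configuration (values are illustrative).
# "num_train_epochs" and "learning_rate" are standard simpletransformers args.
sweep_config = {
    "method": "grid",  # try every combination; "random"/"bayes" also work
    "metric": {"name": "eval_loss", "goal": "minimize"},  # assumed metric name
    "parameters": {
        "num_train_epochs": {"values": [1, 2]},
        "learning_rate": {"values": [2e-5, 4e-5]},
    },
}

# The sweep would then be launched roughly like:
#   import wandb
#   sweep_id = wandb.sweep(sweep_config, project="bert-comparison")
#   wandb.agent(sweep_id, function=train)  # train() builds and fits the model
```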

Currently I'm a bit confused about how to select a good dataset for:

  1. the hyperparameter optimization
  2. the final training with the best hyperparameters.

So for the hyperparameter search, should I create n cross-validation splits and then run a training cycle with the currently selected hyperparameters on each of the n splits?
E.g. suppose I created 2 train/test splits and I only want to find the best number of epochs out of [1, 2]:
for both train/test splits, training is done for 1 epoch, and in the next cycle for 2 epochs?
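To make sure I'm describing the loop correctly, here is a sketch of what I have in mind. `train_and_eval` is a hypothetical placeholder standing in for an actual simpletransformers training run that returns a validation score:

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds."""
    fold_size, remainder = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        end = start + fold_size + (1 if i < remainder else 0)
        folds.append(list(range(start, end)))
        start = end
    return folds

def cross_validate(data, candidate_epochs, train_and_eval, k=2):
    """Return (best epoch count, mean validation score per candidate)."""
    folds = k_fold_indices(len(data), k)
    scores = {}
    for epochs in candidate_epochs:
        fold_scores = []
        for i, val_idx in enumerate(folds):
            # train on all folds except fold i, validate on fold i
            train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
            fold_scores.append(train_and_eval(train_idx, val_idx, epochs))
        scores[epochs] = sum(fold_scores) / len(fold_scores)
    return max(scores, key=scores.get), scores
```

Each candidate epoch count is scored as the mean validation score across all folds, and the best-scoring candidate wins.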

And once I've found the best hyperparameters, should I then train the final model on my full dataset?

Hope my questions are reasonably clear.

Hi @simonkleinfeld!

Thank you for writing in! The W&B Help channel is usually meant for support with W&B issues; you would probably get a better response on the community channel: Show the Community! - W&B Community.

In any case, I'll take a stab at helping here: a good dataset for hyperparameter optimization would be the same as (or similar to) the dataset you plan to use for the final training and inference, since good hyperparameters are usually dataset-dependent.