Dataset selection for hyperparameter optimization and training

I want to build a multiclass classifier using a BERT model. For this I would like to compare the performance of (at least) two domain-specific BERT models. But before I compare model performance, I would like to find the best hyperparameters using W&B sweeps and the simpletransformers API (simpletransformers has an easy integration with wandb).
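For context, the sweep setup I have in mind looks roughly like the sketch below. The metric name and the parameter ranges are just illustrative assumptions, not values I have settled on:

```python
# Sketch of a W&B sweep configuration for tuning a simpletransformers
# classifier. Parameter ranges and the metric name are illustrative
# assumptions only.
sweep_config = {
    "method": "bayes",  # could also be "grid" or "random"
    "metric": {"name": "eval_loss", "goal": "minimize"},
    "parameters": {
        "num_train_epochs": {"values": [1, 2, 3]},
        "learning_rate": {"min": 1e-5, "max": 1e-4},
    },
}

# Launching the sweep would then look roughly like:
# sweep_id = wandb.sweep(sweep_config, project="bert-comparison")
# wandb.agent(sweep_id, function=train)  # train() builds the model and calls train_model()
```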

Currently I’m a bit confused about how to select a good dataset for

  1. the hyperparameter optimization
  2. the training with the best hyperparams.

So for the hyperparameters, should I create n cross-validation splits and then run a training cycle with the currently selected hyperparameters on each of the n splits?
E.g. say I created 2 train/test splits and only want to find the best number of epochs out of [1, 2]:
for both train/test splits, training is run for 1 epoch, and in the next cycle for 2 epochs?
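In other words, the loop I have in mind is something like the sketch below, where `train_and_eval` is a hypothetical placeholder for one simpletransformers training run:

```python
# Sketch of the cross-validation loop described above: for every
# hyperparameter candidate, train and evaluate on each of the n splits
# and pick the candidate with the best mean score.

def make_folds(data, n_folds):
    """Split data into n_folds (train, test) pairs."""
    folds = []
    size = len(data) // n_folds
    for i in range(n_folds):
        test = data[i * size:(i + 1) * size]
        train = data[:i * size] + data[(i + 1) * size:]
        folds.append((train, test))
    return folds

def cross_validate(data, candidates, n_folds, train_and_eval):
    """Return the candidate with the best mean score across folds."""
    scores = {}
    for params in candidates:
        fold_scores = [train_and_eval(train, test, params)
                       for train, test in make_folds(data, n_folds)]
        scores[params] = sum(fold_scores) / len(fold_scores)
    return max(scores, key=scores.get)
```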

And once I have found the best hyperparameters, should I then train the final model on my full dataset?
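I imagine that final step would look roughly like this; the model name, label count, and argument values are made-up placeholders:

```python
# Sketch of the final step: after the sweep, retrain once on the full
# dataset with the winning hyperparameters. All values here are
# illustrative assumptions.
best_args = {"num_train_epochs": 2, "learning_rate": 4e-5}

# With simpletransformers this would look roughly like:
# from simpletransformers.classification import ClassificationModel
# model = ClassificationModel("bert", "bert-base-cased",
#                             num_labels=4, args=best_args)
# model.train_model(full_train_df)
```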

I hope my questions are reasonably clear.

Hi @simonkleinfeld!

Thank you for writing in! The W&B Help channel is usually meant for support with W&B issues; you would probably get a better response in the community channel: Show the Community! - W&B Community.

In any case, I’ll take a stab at helping here: a good dataset for tuning would be the same as (or similar to) the dataset you plan to use for final training and inference, since good hyperparameters are usually dataset-dependent.
