I am making a simple network in PyTorch with linear units as a practice project. I'd like to use sweeps to find the best hyperparameters for the network. Some of these hyperparameters include `batch_norm`, the dropout value, the number of hidden layers, and the number of units in each hidden layer.
I can't figure out how to set up the model and sweep config so that two different model structures can be swept without things getting confusing. For example, I want to use `batch_norm` OR a dropout value from [0, 0.2, 0.4, 0.5]; I never want `batch_norm` AND `dropout` to be used together. If I use random search with wandb, it may choose both 0.4 `dropout` AND `batch_norm`, which I don't want.
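For concreteness, a flat sweep config along these lines (the parameter names and metric here are just illustrative) is what I have in mind, and nothing in it stops random search from sampling both options in the same run:

```python
sweep_config = {
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        # Both knobs are sampled independently, so a single run can get
        # batch_norm=True AND dropout=0.4, which I want to rule out.
        "batch_norm": {"values": [True, False]},
        "dropout": {"values": [0, 0.2, 0.4, 0.5]},
    },
}
```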
I know how to set up the network class with simple if statements so it adds either `batch_norm` or `dropout`, but the `wandb.config` would still select a value for `dropout` and a boolean for `batch_norm`, and I don't want the sweep report to show both parameters if the network only uses one.
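Here is a minimal sketch of the if-statement approach I mean (the layer sizes and argument names are placeholders, not my actual code):

```python
import torch.nn as nn

class Net(nn.Module):
    # Sketch: the model only ever builds ONE of the two regularizers,
    # but the sweep still samples and reports both config values.
    def __init__(self, in_features, hidden, out_features, use_batch_norm, dropout):
        super().__init__()
        layers = [nn.Linear(in_features, hidden)]
        if use_batch_norm:
            layers.append(nn.BatchNorm1d(hidden))
        layers.append(nn.ReLU())
        if not use_batch_norm:
            layers.append(nn.Dropout(dropout))  # ignored entirely when batch norm is on
        layers.append(nn.Linear(hidden, out_features))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```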
Another example: I'd like 2, 3, 4, or 5 hidden layers, with each layer's neuron count randomly selected from the set [64, 128, 256, 512].
I can foresee a problem where wandb selects a model with 3 hidden layers but still picks, say, 256 neurons for the 4th or 5th layer, which will be misleading on the sweep parameter graph.
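The naive setup I'm picturing looks something like this, with one hypothetical config key per possible layer (`units_layer_1` through `units_layer_5`):

```python
import torch.nn as nn

def build_hidden_stack(config, in_features):
    # Only the first n_hidden_layers unit counts are actually used, but
    # wandb samples (and reports) units_layer_4 / units_layer_5 regardless.
    layers, prev = [], in_features
    for i in range(1, config["n_hidden_layers"] + 1):
        units = config[f"units_layer_{i}"]  # one of [64, 128, 256, 512]
        layers += [nn.Linear(prev, units), nn.ReLU()]
        prev = units
    return nn.Sequential(*layers), prev
```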