I am making a simple network in PyTorch with linear units as a practice project. I’d like to use sweeps to find the best hyperparameters for the network. Some of these hyperparameters include batch_norm, dropout value, number of hidden layers, and number of units in each hidden layer.
I can’t figure out how to set up the model and sweep config so that two different model structures can be swept without it getting confusing. For example, I want the network to use batch_norm OR a dropout value from [0, 0.2, 0.4, 0.5]; I never want both to be used together. If I use random search with wandb, it may choose both 0.4 dropout AND batch_norm, which I don’t want.
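One workaround I’ve sketched (the parameter name and value encoding below are my own invention, not anything wandb prescribes) is to collapse the mutually exclusive options into a single categorical parameter, so random search can never pick batch norm and dropout at the same time, and the sweep report only ever shows one parameter:

```python
# Sketch: one categorical "regularization" parameter whose values each
# encode a complete, valid choice (batch norm, OR one dropout rate).
sweep_config = {
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "regularization": {
            "values": ["batch_norm", "dropout_0", "dropout_0.2",
                       "dropout_0.4", "dropout_0.5"]
        },
    },
}

def parse_regularization(value):
    """Decode the categorical value back into model settings."""
    if value == "batch_norm":
        return {"use_batch_norm": True, "dropout": 0.0}
    # e.g. "dropout_0.4" -> dropout rate 0.4, no batch norm
    return {"use_batch_norm": False, "dropout": float(value.split("_")[1])}
```

Inside the training function, `parse_regularization(wandb.config.regularization)` would then feed the decoded settings to the model constructor.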
I know how to set up the network class with simple if statements so it adds either batch_norm or dropout, but wandb.config would still select both a dropout value and a batch_norm boolean, and I don’t want the sweep report to show both parameters when the network only uses one.
Another example: I’d like 2, 3, 4, or 5 hidden layers, and I’d like each layer to have a number of neurons randomly selected from [64, 128, 256, 512].
I can foresee a problem where wandb selects a model with 3 hidden layers but also picks, say, 256 neurons for the 4th or 5th layer, which would be misleading on the sweep parameter graph.
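The same single-categorical trick could cover this too. A sketch (the string encoding is my own assumption): enumerate whole architectures as single values like "128-64-256" for three hidden layers, so the sweep never samples or reports a width for a layer that doesn’t exist.

```python
import itertools

# Sketch: every valid architecture of 2-5 hidden layers, each width drawn
# from [64, 128, 256, 512], encoded as a single string value.
widths = [64, 128, 256, 512]
architectures = [
    "-".join(str(w) for w in combo)
    for n in (2, 3, 4, 5)
    for combo in itertools.product(widths, repeat=n)
]

def parse_architecture(value):
    """Turn '128-64-256' back into a list of hidden-layer sizes."""
    return [int(w) for w in value.split("-")]

sweep_config = {
    "method": "random",
    "parameters": {"architecture": {"values": architectures}},
}
```

That’s 4² + 4³ + 4⁴ + 4⁵ = 1360 values, which is large but still a flat list of valid configurations; random search then samples only real architectures.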