Hyperparameter optimization with k folds on each iteration

aulrichsen · November 16, 2022, 3:55pm

I am trying to perform hyperparameter optimization with wandb and for each iteration I would like to get the average performance across 3 different folds of my dataset.

I have defined a function optimize that I pass to wandb.agent:

def optimize(config):
    for fold in range(1, 4):
        dataset_artifact = f'fold-{fold}:latest'
        config['dataset_artifact'] = dataset_artifact
        with wandb.init(
            config=config,
            group=group_name,
            job_type=f'train-fold-{fold}',
            name=f'train-fold-{fold}',
            reinit=True,
        ) as run:
            train_and_log(config, run)
            run.finish()

I would expect this to create a separate run for each fold (since I have specified a different job type and run name, as well as passing reinit=True), so that I would end up with:

Group: param_combo_1

> Job Type: train-fold-1

> train-fold-1

> Job Type: train-fold-2

> train-fold-2

> Job Type: train-fold-3

> train-fold-3

However, each run for a given hyperparameter iteration overwrites the previous fold, so I in fact end up with:

Group: param_combo_1

> Job Type: train-fold-3

> train-fold-3

How can I resolve this issue?

system · November 17, 2022, 3:09pm

Hi Alexander, thanks for writing in!

I can see the same behaviour as you do. After some investigation, I have found that this is intended: for each combination of parameters, the sweep creates only one run (same run ID), so each wandb.init call just resumes the previous run even though you pass reinit=True. As a workaround, I think there are two ways to solve this:

1. Average your metrics inside the optimize()/train_and_log() function in the same run instead of creating different runs.
2. Use the grid method instead of random and repeat some values (e.g. batch_size=[64,64,64,128,128,128]).
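
For the first option, a minimal sketch might look like the following. Note that train_one_fold and average_fold_metrics are hypothetical helpers (not part of wandb), the metric names are placeholders, and the returned numbers are dummy values standing in for real training results. It assumes the sweep agent calls optimize() without arguments and supplies the hyperparameters through run.config.

```python
from statistics import mean


def train_one_fold(config, fold):
    """Hypothetical stand-in for the real training code; returns one fold's final metrics."""
    # Placeholder values: a real implementation would load f"fold-{fold}:latest" and train on it.
    return {"val_loss": 1.0 / fold, "val_acc": 0.5 + 0.1 * fold}


def average_fold_metrics(per_fold):
    """Average each metric over a list of per-fold metric dicts (hypothetical helper)."""
    keys = per_fold[0].keys()
    return {f"mean_{k}": mean(m[k] for m in per_fold) for k in keys}


def optimize():
    import wandb  # imported here so the helpers above can be used without wandb installed

    # One run per hyperparameter combination; all folds are logged into it.
    with wandb.init() as run:
        per_fold = []
        for fold in range(1, 4):
            metrics = train_one_fold(run.config, fold)
            # Fold-specific keys keep the folds from overwriting each other.
            run.log({f"fold_{fold}/{k}": v for k, v in metrics.items()})
            per_fold.append(metrics)
        # Averages across folds, logged under keys such as "mean_val_loss".
        run.log(average_fold_metrics(per_fold))
```

With this layout, the metric name in the sweep configuration would presumably point at one of the averaged keys (for example mean_val_loss) rather than at a per-fold value.
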
Please let me know if either of these would work for you, or if you would like me to create a request for this feature (I was thinking of something like a new agent argument, such as repeat=number_of_repetitions, that averages the results). If so, I would really appreciate it if you could give me some more details about your use case and why this new feature would be useful for you. Thanks!

Best,
Luis

system · November 22, 2022, 10:28am

Hi Alexander,

We wanted to follow up with you regarding your support request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

Best,
Luis

aulrichsen · November 25, 2022, 11:57am

Hi Luis,

Those workarounds will probably be okay, but I would very much like to see a feature that allows this behaviour through a parameter such as ‘repeat’, as you mentioned.

Many thanks,
Alexander

system · November 28, 2022, 10:49am

Hi Alexander,

Thanks for confirming this! I have created a request for this feature, thanks for suggesting it. May I help you in any other way?

Best,
Luis

system · January 24, 2023, 11:58am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
