I am using multiple machines to parallelize a cross-validation over the folds. Inside my fold-training function I have implemented a wandb sweep, and what I would like to do is start an agent on each machine. Each machine will have a different fold of training/validation/test data, but I want them all to sweep through the exact same sequence of hyperparameters and write their outputs to the same sweep_id, which I impose from outside. That way I can later group the different CV runs together for the same hyperparameter sets. The issue is that I haven't found a good way to force the agents to run through the same sequences of hyperparameters. I guess, by definition, the whole reason agents are orchestrated by the same sweep controller is that they should distribute the work of testing different hyperparameter sets, instead of all computing the same thing over and over. So maybe this functionality of enforcing the same hyperparameter combinations on different agents isn't even there?
One solution would be to enforce method=grid, but I would also like to be able to use, for instance, random search. Another solution would be to somehow make the k_fold of the CV a hyperparameter, but then, with methods other than grid, I again have no guarantee that all k_fold values will be tested.
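For the grid option, the idea of making the fold index a sweep parameter can be sketched like this (a minimal sketch — the parameter names and values are made up, but the dict follows the wandb sweep-config schema; with method "grid" the controller enumerates the full Cartesian product, so every hyperparameter combination is run for every fold):

```python
# minimal sketch of a grid sweep config; parameter names/values are made up
sweep_config = {
    "method": "grid",
    "parameters": {
        "learning_rate": {"values": [1e-3, 1e-4]},
        "hidden_size": {"values": [64, 128]},
        # treating the CV fold index as just another grid dimension
        "k_fold": {"values": [0, 1, 2, 3, 4]},
    },
}

# with method="grid", every (learning_rate, hidden_size) pair is
# guaranteed to be visited once per fold
n_combinations = 1
for p in sweep_config["parameters"].values():
    n_combinations *= len(p["values"])
```

The catch, as noted above, is that this guarantee disappears as soon as the method is anything other than grid.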
It would be so cool to have a way to enforce the same random seed, with redundant computation, on agents that write to the same sweep.
Thank you for reaching out. I’ll be glad to assist you with this. We’ll investigate and get back to you with updates.
Hi @jcnumeus, please see this example of how to perform traditional k-fold cross-validation using sweeps, which is the currently supported approach.
You are correct that you can’t force multiple agents to reproduce the same sequence of hyperparameters. Our internal sweep controller lines up hyperparameter sets to test and feeds them to the agent(s). There is an open feature request for setting random sweep seeds; however, it isn’t being considered by our sweeps team at this time. If this changes we’ll be sure to make an announcement. Please let us know if you have any other questions.
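Since the controller itself can’t be seeded, one workaround (outside of W&B entirely — this is plain Python, and the parameter ranges below are made up) is to generate the random-search sequence yourself from a fixed seed, so that every machine reproduces the identical list of configs:

```python
import random

def sample_configs(n, seed=0):
    """Draw n random-search configs deterministically from a fixed seed."""
    rng = random.Random(seed)  # same seed -> same sequence on every machine
    configs = []
    for _ in range(n):
        configs.append({
            "learning_rate": 10 ** rng.uniform(-5, -2),  # log-uniform draw
            "hidden_size": rng.choice([64, 128, 256]),
            "dropout": rng.uniform(0.0, 0.5),
        })
    return configs

# every fold/machine calls this with the same seed and gets identical configs
assert sample_configs(5, seed=42) == sample_configs(5, seed=42)
```

Each machine would then iterate over this list for its own fold, logging runs to a common project/group rather than relying on the sweep controller to hand out configs.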
Thanks for your replies! That CV example looks similar to what I’m currently doing, but I don’t fully understand whether it executes the same parameter sets for each fold — that is exactly what I need. I’m thinking: couldn’t this also be done by calling wandb.config a number of times (i.e. the number of grid points I want to search) beforehand and appending all those configs as dicts to a queue? Then I could always send the same config to the tasks running for the different folds and overwrite the config at the end of the run with wandb.config.update. Something along the lines of:
```python
# task on remote machine
def train():
    config = configs.pop()
    # load data for fold k
    # train model
    config['k'] = k

wandb.agent(sweep_id, train, count=len(configs))
```

```python
import wandb

# before starting the remote workers...
sweep_id = wandb.sweep(sweep_config)
configs = []

# this should populate my list of configs
wandb.agent(sweep_id, get_config, count=50)

# start up remote tasks: one task per k fold that trains all configs sequentially
for k in range(k_folds):
    remote.map(remote_task, (configs, k))
```
Wouldn’t that, in a hacky way, enforce running the same beforehand-generated configs across each fold? I could even add 'k': [None] to my sweep_config, so that it appears in the sweep parameter plots, and then, due to the config overwrite, it should be correctly displayed as an integer in the plot?
Not sure whether I’m missing anything here. Maybe some of the stats & plots on the wandb dashboard get messed up if I do it like that?
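Setting the wandb pieces aside, the queue idea itself can be sketched in plain Python (the grid, fold count, and `run_fold` stand-in below are made up). One detail worth noting: popping from a shared list consumes it, so iterating the list per fold is what actually guarantees every fold sees the identical sequence:

```python
from itertools import product

# pre-generate the configs once, before any fold starts
grid = {"lr": [1e-3, 1e-4], "hidden": [64, 128]}
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

def run_fold(k, configs):
    """Stand-in for the remote task: runs every config on fold k."""
    results = []
    for config in configs:             # iterate, don't pop -- keeps the list intact
        config = {**config, "k": k}    # tag the run with its fold index
        results.append(config)         # (training would happen here)
    return results

k_folds = 3
all_results = [run_fold(k, configs) for k in range(k_folds)]

# every fold saw the same hyperparameter sequence, differing only in 'k'
for fold_results in all_results:
    stripped = [{key: v for key, v in c.items() if key != "k"} for c in fold_results]
    assert stripped == configs
```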