Force Bayesian sweep to run certain variable tests

Hi!
I want to run a Bayesian hyperparameter sweep with 5-fold cross-validation. In other words, I want the Bayesian sweep to choose a configuration, run 5 runs with that configuration, and log each run. The easiest way to do this would be to have a sweep variable, e.g. fold_id, that can take the values 1, 2, 3, 4, 5, and to force the agent to always test every fold_id for each configuration.

Is there any way to do this, i.e. to force the sweep agent to always test every value of one variable even though the sweep is Bayesian? In a way it would be like running a grid sweep over a Bayesian sweep.

One way I’ve thought of is nesting all parameters inside the fold_id variable, but that probably still won’t do what I’m after.

I’ve seen the k-fold CV example code, but it’s quite advanced and does not seem to work when running on CUDA, and my understanding of multiprocessing is limited.

Thank you!

Another way I’ve thought of is the following:

Use this pseudo trainer function (called by wandb.agent):

def trainer(config=None):
    # Initialize a new run with the sweep configuration
    wandb.init(config=config)

    # Store the config parameters
    sweepconfig = wandb.config

    # Load the raw data and split into k-folds
    data = pd.read_csv(...)
    folds = Make_KFolds().Split(...)

    # Store the sweep run's name
    name = wandb.run.name

    sweep_metric = []
    for fold_id in range(1, k_folds + 1):
        # Init a new run to store the performance of this fold
        wandb.init(config=sweepconfig, name=f"{name}-{fold_id}", group=name)
        wandb.config.update({'fold_id': fold_id})
        # Run the training function, which logs metrics to the fold's run
        metric = RunTrainingEpochs(...)
        sweep_metric.append(metric)

    # Exit loop, resume the sweep run and log the avg sweep metric as the average performance
    wandb.init(config=sweepconfig, name=name, resume=True)
    wandb.log({'Avg metric': sum(sweep_metric) / k_folds})

But this does not work. In my opinion this should basically do the same as the kfold-CV example code. The sweep agent seems to be limited to one run per trainer call. Even though it initializes new runs, the previous run is continuously overwritten by the next wandb.init call.

You can create a nested sweep in which the folds are stored as a list, and then iterate over those values in your training code. Make sure that the run name changes for each run so that the runs don’t overwrite one another.

Here’s an example config of a nested sweep:

command:
  - ${env}
  - python3
  - ${program}
  - ${args}
method: random
parameters:
  MULTI_STAGE_TRAINING:
    value:
      DEPTH_SCALE:
        - 100
        - 100
      HEAD:
        - OBJECT_DETECTION
      NETWORK:
        - net_a
        - net_b
        - net_c
      NUM_EPOCHS_IN_EACH_STAGE:
        - 0
        - 1
        - 2
        - 3
      NUM_STAGES:
        - 0
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6
        - 7
        - 8
        - 9
      OPTIMIZER_PARAMS_PER_STAGE:
        lr:
          - 0
          - 1
          - 2
          - 3
          - 4
          - 5
          - 6
          - 7
          - 8
          - 9
        momentum:
          - 0
          - 1
          - 2
          - 3
          - 4
          - 5
          - 6
          - 7
          - 8
          - 9
        weight_decay:
          - 0
          - 1
          - 2
          - 3
          - 4
          - 5
          - 6
          - 7
          - 8
          - 9
  epochs:
    value: 10
program: script.py
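
For the Bayesian case asked about above, the same nesting idea could be written as a Python dict. This is only a sketch: the hyperparameter names, their ranges, the metric name avg_metric, and the cv / fold_ids keys are all placeholders I made up for illustration. The fold list is a fixed value, so the Bayesian search only varies the other parameters.

```python
# Hypothetical sweep definition; every parameter name and range is a placeholder.
sweep_config = {
    "method": "bayes",
    # Bayesian sweeps need a metric to optimize; here it is the fold average
    # that the training code is assumed to log under this name.
    "metric": {"name": "avg_metric", "goal": "maximize"},
    "parameters": {
        # Searched by the Bayesian optimizer
        "learning_rate": {"min": 1e-5, "max": 1e-1},
        "batch_size": {"values": [16, 32, 64]},
        # Fixed nested value: every run receives the full list of folds
        "cv": {"value": {"fold_ids": [1, 2, 3, 4, 5]}},
    },
}

# Registering it would look like this (requires wandb and a logged-in account):
# sweep_id = wandb.sweep(sweep_config, project="my-project")
```

With a config along these lines, one sweep run corresponds to one hyperparameter configuration, and the folds are handled inside that run rather than by the sweep agent.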

And here’s a script that is able to run it:

import wandb


def create_sweep(sweep_config: dict, update: bool, project: str, entity: str):
    # Nested parameter: the whole dict is passed to every run as one fixed value
    parameters_dict = {
        'MULTI_STAGE_TRAINING': {
            'value': {
                'NUM_STAGES': list(range(10)),
                'OPTIMIZER_PARAMS_PER_STAGE': {
                    'lr': list(range(10)),
                    'momentum': list(range(10)),
                    'weight_decay': list(range(10)),
                },
                'NUM_EPOCHS_IN_EACH_STAGE': list(range(4)),
                'NETWORK': ['net_a', 'net_b', 'net_c'],
                'HEAD': ['OBJECT_DETECTION'],
                'DEPTH_SCALE': [100, 100],
            }
        },
        'epochs': {'value': 10},
    }
    sweep_config['parameters'] = parameters_dict

    # Register the sweep and return its ID (the update flag is not used here)
    return wandb.sweep(sweep_config, entity=entity, project=project)


if __name__ == '__main__':
    SWEEP_CONFIG = {
        'method': 'random',
        'program': 'script.py',
        'command': ['${env}', 'python3', '${program}', '${args}'],
    }
    ENTITY = 'demonstrations'
    PROJECT = 'sweep_gm'
    UPDATE = True

    sweep = create_sweep(
        sweep_config=SWEEP_CONFIG,
        entity=ENTITY,
        project=PROJECT,
        update=UPDATE)
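
On the training side, a rough sketch of "iterate over those values" could look like the following. It keeps everything in a single run per configuration, logs each fold under its own key so nothing gets overwritten, and logs the average for the sweep to optimize. The names run_cross_validation, make_trainer, train_one_fold, avg_metric, and the cv / fold_ids config keys are hypothetical, not part of wandb; adapt them to your own config.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, List


def run_cross_validation(
    fold_ids: Iterable[int],
    train_one_fold: Callable[[int], float],
    log: Callable[[Dict[str, float]], None],
) -> float:
    """Train once per fold and log per-fold metrics plus their average."""
    fold_metrics: List[float] = []
    for fold_id in fold_ids:
        metric = train_one_fold(fold_id)
        # A distinct key per fold keeps the results from overwriting each other
        log({f"fold_{fold_id}/metric": metric})
        fold_metrics.append(metric)
    avg_metric = mean(fold_metrics)
    # The sweep's metric name is assumed to point at this key
    log({"avg_metric": avg_metric})
    return avg_metric


def make_trainer(train_one_fold: Callable[[int], float]) -> Callable[[], None]:
    """Build a function that can be passed to wandb.agent(sweep_id, function=...)."""

    def trainer() -> None:
        import wandb  # imported lazily so the helper above works without wandb

        with wandb.init() as run:
            # Hypothetical nested parameter holding the list of folds
            fold_ids = run.config["cv"]["fold_ids"]
            run_cross_validation(fold_ids, train_one_fold, run.log)

    return trainer
```

Because wandb is only touched inside trainer, the fold loop can be checked on its own, e.g. run_cross_validation([1, 2, 3], my_fold_fn, print) with any function that returns a float per fold.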

Let me know if you need any further help with this!

Do you still need any help here?